APPLICATIONS OF NON-PARAMETRIC DENSITY ESTIMATION


DISSERTATION 

Ahmed Mohamed Mohamed Sultan
Lieutenant  Colonel,  Egyptian  Air  Force 


AFIT/DS/ENC/90-1


DISTRIBUTION STATEMENT A
Approved for public release;
Distribution Unlimited

DEPARTMENT  OF  THE  AIR  FORCE 

AIR  UNIVERSITY 

AIR  FORCE  INSTITUTE  OF  TECHNOLOGY 


Wright-Patterson  Air  Force  Base,  Ohio 




AFIT/DS/ENC/90-1


APPLICATIONS  OF  NON-PARAMETRIC  DENSITY 

ESTIMATION 


DISSERTATION 


Presented  to  the  Faculty  of  the  School  of  Engineering 
of  the  Air  Force  Institute  of  Technology 
Air  University 
In  Partial  Fulfillment  of  the 
Requirements  for  the  Degree  of 
Doctor  of  Philosophy 


Ahmed Mohamed Mohamed Sultan, B.S., Diploma Degree, M.S.
Lieutenant  Colonel,  Egyptian  Air  Force 


February  1990 


Approved  for  public  release;  distribution  unlimited 


AFIT/DS/ENC/90-1 


APPLICATIONS  OF  NON-PARAMETRIC  DENSITY 

ESTIMATION 

Ahmed Mohamed Sultan, B.S., Diploma Degree, M.S.

Lieutenant  Colonel,  Egyptian  Air  Force 

Approved: 


Preface 


I  would  like  first  to  thank  Allah  (God)  whose  help  was  more  than  necessary  to 
finish  my  research  and  who  facilitates  all  other  means  for  me. 

I am more than deeply indebted to a professor who means more to me than any
words can express: Prof. A. H. Moore, who had the idea of making nonparametric
density estimation accessible through the practical application of the
theoretical results of the subject. He was always there whenever I had any kind
of problem. I shall always be proud of being his student and I shall always
remember his words to me. It was not just a professor-student relationship; I can
simply say it was more than a father-son relationship. I hope Prof. Moore is
happy with my final effort and output, after a long preparation of my educational
experience that started in 1981 and finished with this research. With his huge
number of publications and his research and technical expertise, he was the best
person to get help from during the various phases of this research. I am also
thankful to Dr. Joseph P. Cain, to whom I am indebted for his interest in linear
models when I chose this area as my master's thesis research area. I am thankful
to my committee members Dr. Cain, Dr. Robinson, and Dr. Bauer for their friendly
spirit, enthusiasm, and valuable comments and notes on the draft, which helped me
obtain a better final presentation of the different ideas of the research.

Finally and most importantly, to my wife, Azza, to my son Mohamed, my lovely
Egyptian-American daughter Dina, and my beautiful twins Patina and Maii, I would
like to express my love and appreciation for their continued support and help
during my study at AFIT.

Ahmed  Mohamed  Mohamed  Sultan 




Table of Contents

Preface
Table of Contents
List of Figures
List of Tables
Abstract

I. Introduction

II. Survey Of Some Nonparametric Density Estimation Methods
    Introduction
    The Method Of Orthogonal Series
    The Method Of Penalty Functions
    The Method Of Delta Sequence
    The Nearest Neighbor Method
    Discussion Of Different Methods

III. Monte Carlo Comparison For Some Distributions Using The Kernel Method
    Introduction
    The Histogram
    The Naive Estimator
    The Kernel Method
    Mean and Variance Of The Estimator
    Invariance Property Of The Kernel Method
    The Monte Carlo Comparison

IV. Optimal Choice Of The Smoothing Parameter
    Introduction
    Methodology

V. Parameter Estimation
    Introduction
    Maximum Likelihood Estimation For The Parameters Of The Three Parameter Weibull Distribution
    Methodology
    Stopping Criterion
    Results

VI. Minimum Distance Estimation
    Introduction
    Minimum Distance Estimation For The Three Parameter Weibull Distribution
    Methodology
    Results

VII. Goodness Of Fit Application
    Introduction
    Modified Goodness Of Fit Test
    Methodology
    The Technique And The Results

VIII. Adaptive Nonparametric Kernel Density Estimation Application
    Introduction
    Percentile Ratios
    An Adaptive Methodology

Appendix A. Generation of random deviates
    1. Cauchy Distribution
    2. Logistic Distribution
    3. Weibull Distribution

Bibliography

Vita


List of Figures

Figure

1. A nonparametric p.d.f. for the uniform distribution with sample size 60
2. A nonparametric c.d.f. for the uniform distribution with sample size 60
3. A nonparametric p.d.f. for the exponential distribution with sample size 60
4. A nonparametric c.d.f. for the exponential distribution with sample size 60
5. A nonparametric p.d.f. for the Cauchy distribution with sample size 60
6. A nonparametric c.d.f. for the Cauchy distribution with sample size 60
7. A nonparametric p.d.f. for the double exponential distribution with sample size 60
8. A nonparametric c.d.f. for the double exponential distribution with sample size 60
9. A nonparametric p.d.f. for the logistic distribution with sample size 60
10. A nonparametric c.d.f. for the logistic distribution with sample size 60
11. A nonparametric p.d.f. for the normal distribution with sample size 60
12. A nonparametric c.d.f. for the normal distribution with sample size 60
13. p.d.f. for W(10,5,1) with N=10
14. C.D.F. for W(10,5,1) with N=10
15. p.d.f. for W(10,5,2) with N=10
16. C.D.F. for W(10,5,2) with N=10
17. p.d.f. for W(10,5,3) with N=10
18. C.D.F. for W(10,5,3) with N=10
19. p.d.f. for W(10,5,4) with N=10
20. C.D.F. for W(10,5,4) with N=10
21. p.d.f. for W(10,5,1) with N=20
22. C.D.F. for W(10,5,1) with N=20
23. p.d.f. for W(10,5,2) with N=20
24. C.D.F. for W(10,5,2) with N=20
25. p.d.f. for W(10,5,3) with N=20
26. C.D.F. for W(10,5,3) with N=20
27. p.d.f. for W(10,5,4) with N=20
28. C.D.F. for W(10,5,4) with N=20
29. p.d.f. for W(10,5,1) with N=30
30. C.D.F. for W(10,5,1) with N=30
31. p.d.f. for W(10,5,2) with N=30
32. C.D.F. for W(10,5,2) with N=30
33. p.d.f. for W(10,5,3) with N=30
34. C.D.F. for W(10,5,3) with N=30
35. p.d.f. for W(10,5,4) with N=30
36. C.D.F. for W(10,5,4) with N=30


List of Tables

Table

1. Different Kernels with their Efficiency
2. Values of MISE for different distributions with standard deviation based on M.C size 1000 and sample of size n for each repetition
3. h Values for Different Distributions
4. Optimal h for sample sizes 20 for Different Distributions
5. Percentage improvement in MISE relative to choice of h as for different distributions
6. Results from M.C size 1000 for sample size 20
7. Weibull Sample (Shape = 1.0 and Sample Size = 10)
8. Weibull Sample (Shape = 2.0 and Sample Size = 10)
9. Weibull Sample (Shape = 3.0 and Sample Size = 10)
10. Weibull Sample (Shape = 4.0 and Sample Size = 10)
11. Weibull Sample (Shape = 1.0 and Sample Size = 20)
12. Weibull Sample (Shape = 2.0 and Sample Size = 20)
13. Weibull Sample (Shape = 3.0 and Sample Size = 20)
14. Weibull Sample (Shape = 4.0 and Sample Size = 20)
15. Weibull Sample (Shape = 1.0 and Sample Size = 30)
16. Weibull Sample (Shape = 2.0 and Sample Size = 30)
17. Weibull Sample (Shape = 3.0 and Sample Size = 30)
18. Weibull Sample (Shape = 4.0 and Sample Size = 30)
19. Critical Values for N=5(5)60
20. Power for the normal
21. Power for the normal against other distributions N=5
22. Power for the normal against other distributions N=10
23. Power for the normal against other distributions N=15
24. Power for the normal against other distributions N=20
25. Power for the normal against other distributions N=25
26. Power for the normal against other distributions N=30
27. Power for the normal against other distributions N=35
28. Power for the normal against other distributions N=40
29. Power for the normal against other distributions N=45
30. Power for the normal against other distributions N=50
31. Power for the normal against other distributions N=55
32. Power for the normal against other distributions N=60
33. Critical Values using AD
34. Power of the test for the normal using AD
35. Power of the test using AD N=5
36. Power of the test using AD N=10
37. Power of the test using AD N=15
38. Power of the test using AD N=20
39. Power of the test using AD N=25
40. Power of the test using AD N=30
41. Power of the test using AD N=35
42. Power of the test using AD N=40
43. Power of the test using AD N=45
44. Power of the test using AD N=50
45. Power of the test using AD N=55
46. Power of the test using AD N=60
47. Values of Percentile ratios for Different Distributions
48. Suggested k for the h value
49. Average sample percentile ratios
50. MISE for the adaptive technique (with standard deviation in brackets)


AFIT/DS/ENC/90-1 


Abstract 

The dissertation examines various methods of nonparametric density estimation,
and nonparametric kernel estimation in more detail. The consequences of
various kernel window widths and their effect on the mean integrated square error
are examined using Monte Carlo techniques.

The mean and the variance of the nonparametric density estimator are derived for
symmetric kernels with finite mean and finite variance. The results also treat
kernels with varying window parameters.

The nonparametric kernel estimate was used to obtain new estimators for the
three parameter Weibull distribution using distance estimation and the
Cramer-von Mises statistic. Comparison with maximum likelihood estimators using a
Monte Carlo sample of size 1000 and various parameter values showed a significant
improvement over the maximum likelihood estimators in the mean integrated square
error between the estimated distribution and the true distribution.

Several new goodness of fit tests are proposed using the nonparametric kernel
estimator and the Cramer-von Mises and the Anderson-Darling statistics. Extensive
Monte Carlo experiments were performed to obtain the critical values for the tests
and to study the power of the tests against eight alternative distributions. The
tests using the Anderson-Darling statistic showed greater power than the K.S. test
against almost all alternative distributions studied.


A new nonparametric kernel estimator was introduced by varying the window
width in each tail portion of the sample. The method permits a different window
width in each tail portion and in the center portion of the sample. The method
uses the sample percentile ratios, separately for each tail, as a measure of that
tail's length. The kernel parameter for the tail sample values is chosen using the
sample percentile ratios for that tail. The new nonparametric kernel estimator
results in mean integrated square errors comparable with those of the estimators
developed earlier.


APPLICATIONS  OF  NON-PARAMETRIC  DENSITY 

ESTIMATION 


I. Introduction

Nonparametric density estimation is a rich research topic, both in estimation
techniques and in applications. Two previous dissertations under the supervision
of Prof. A. H. Moore studied density estimators with applications (Sweeder, 1982
and Fuchs, 1984).

The goal of this research is to continue that work by exploring some new
applications of nonparametric density estimation, using different nonparametric
density estimators.

This dissertation is divided into six main parts (chapters II-VII). The first
part surveys some of the known nonparametric density estimation methods with the
aim of looking at the different results and deciding which of these methods meets
the need for a nonparametric density estimation technique with the least number
of parameters and the most established theoretical results. Among these methods
are the orthogonal series method, the penalty functions method, the delta sequence
method, and the nearest neighbor method. This part concludes with a brief
descriptive comparison of these methods drawn from the literature.




Next, the kernel method is discussed in chapter III. This method (1) has only one
parameter, (2) has the best understood properties, (3) is invariant with respect
to both location and scale, and (4) is computationally effective. Since the choice
of the kernel is not as crucial as the choice of the parameter (window width h) in
the kernel method, the Gaussian kernel is chosen; it has infinite support and so
avoids the problem of finding the support of the estimated density that arises
when using a kernel with finite support. In this chapter it is also shown that for
the kernel method, the mean of the nonparametric density is the sample mean, and
the variance of the nonparametric density is the sample variance plus the kernel
variance. Since for certain applications the invariance of the density estimator
is required, the invariance property for the kernel estimator is also shown. A
suggested h is then introduced, based on the approximate optimal choice of the
window width. A Monte Carlo experiment is designed with this proposed choice of h.
The mean integrated square error (MISE) is used as a measure of the closeness of
the true density to the estimated one. The results from different distributions
are reported for sample sizes 10(10)60.

In  chapter  IV  a  numerical  optimal  choice  of  h  is  derived  in  the  form  of  a 
constant  multiple  of  the  unbiased  estimator  of  the  standard  deviation  divided  by  the 
fifth  root  of  the  sample  size.  The  different  values  of  the  constant  of  multiplication 
together  with  the  corresponding  h  and  MISE  are  reported  for  various  distributions 
and  a  given  sample  size. 
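The form of the window width described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `window_width` is hypothetical, and the default constant c = 1.06 is the familiar normal-reference value, not one of the constants derived in chapter IV. The standard deviation is computed from the unbiased variance estimator (divisor n - 1).

```python
import numpy as np

def window_width(sample, c=1.06):
    """h = c * s / n**(1/5); c = 1.06 is an assumed illustrative constant."""
    x = np.asarray(sample, dtype=float)
    n = x.size
    s = x.std(ddof=1)          # square root of the unbiased variance estimator
    return c * s / n ** 0.2    # divide by the fifth root of the sample size

rng = np.random.default_rng(0)
x = rng.normal(size=60)        # illustrative sample of size 60
h = window_width(x)
```

Because s scales with the data, an h chosen this way scales with the data as well, which is the property the kernel estimator needs in order to be scale invariant.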

Chapters V and VI consider parameter estimation for the three parameter
Weibull distribution. In chapter V the log-likelihood equations are solved
numerically using the hybrid method. Chapter VI considers the use of the minimum
distance estimation technique to estimate the parameters of the three parameter
Weibull distribution, using the Cramer-von Mises statistic as a measure for the
closeness of the density function with parameters obtained by the maximum
likelihood method and the density function with parameters obtained by the new
minimum distance estimation method. Results from a Monte Carlo experiment of size
1000 are reported for both methods. The results demonstrate an improvement of the
new estimation technique over the maximum likelihood technique.

In chapter VII a new modified goodness of fit technique for normality is
introduced. The critical values for the test are generated. The power of the test
for various alternative distributions is computed.

Chapter  VIII  introduces  an  adaptive  density  estimation  based  on  the  choice  of 
different  h  for  each  tail  of  the  distribution.  The  sample  percentile  ratios  are  used  as 
a  criterion  for  the  choice  of  h  in  the  tail  values  of  the  sample. 




II. Survey Of Some Nonparametric Density Estimation Methods


Introduction 

A  large  number  of  methods  of  nonparametric  density  estimation  have  been 
proposed.  These  methods  have  the  common  goal  of  estimating  a  density  function 
when  a  set  of  data  is  given.  A  few  types  of  estimates  were  first  proposed  in  Fix  and 
Hodges  (1951).  Although  the  nonparametric  estimates  involve  some  parameters, 
they  are  still  considered  nonparametric  in  the  sense  of  relaxing  the  assumptions 
about  the  distribution  of  the  observed  data. 

Many different methods of density estimation have been introduced and studied
for a long time. Monte Carlo comparisons have been done for various nonparametric
estimators. Some properties and basic results of the following nonparametric
density estimation methods are surveyed and discussed in this chapter: the
orthogonal series method, the penalty function method, the delta sequence method,
and the nearest neighbor method. The survey and discussion follow essentially the
discussion by Paraska (1983:27-173). The kernel method, which is the method used
for the different applications in the dissertation, is treated in a separate
chapter together with the reasons for choosing it and a Monte Carlo experiment
for different sample sizes from various distributions.




In the orthogonal series method, the density function is expressed in terms of
its orthogonal series expansion, and the estimate of the density is found by
estimating the coefficients in that expansion.

In the method of penalty functions, an estimator is obtained by maximizing the
likelihood function of the sample over a class of densities restricted so that
the likelihood function has a finite maximum.

The delta sequence method is in fact a generalization of other methods such
as Fourier inversion and the kernel method.

The nearest neighbor method is based on fixing a constant r and choosing
the rth ordered distance of all the observations from a given point, then using
this distance as a smoothing parameter.

The method of kernels is a widely used method in applications with the best
understood properties. It came from the idea of the naive estimator, which is an
evolution of the histogram, as will be discussed later in chapter III.

The performance of the different methods has been reported in the literature;
hence in the next part a summary of each of these methods is presented.


The  Method  Of  Orthogonal  Series 

The idea of this method is to express the density function in terms of its
orthogonal series expansion and to find the estimate of the underlying density
through estimating the coefficients in the orthogonal expansion.

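This method was first introduced by Cencov in a 1962 paper. To show the
conditions under which it is possible to expand a function $f$ in terms of a
complete orthonormal basis, let us assume that $\mathcal{X}$ is a space and
$\mathcal{M}$ is a $\sigma$-algebra of subsets of $\mathcal{X}$, i.e.

$$\emptyset \in \mathcal{M}, \tag{1}$$

$$E \in \mathcal{M} \Rightarrow \mathcal{X} \setminus E \in \mathcal{M}, \tag{2}$$

$$E_j \in \mathcal{M} \Rightarrow \cup_j E_j \in \mathcal{M}, \tag{3}$$

where $\cup_j$ is the union over $j$.

Hence $(\mathcal{X}, \mathcal{M})$ will be a measurable space. Let
$\mu : \mathcal{M} \to [0, \infty]$ be a measure, i.e.

$$\mu(\emptyset) = 0, \tag{4}$$

$$E_j \in \mathcal{M} \Rightarrow \mu(\cup_j E_j) = \sum_j \mu(E_j), \tag{5}$$

where the $E_j$ are disjoint, and let the normed space $L_2(\mu)$ be separable.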




If $\mathcal{P}$ represents the family of probability measures on
$(\mathcal{X}, \mathcal{M})$ such that the Radon-Nikodym derivative
$dp/d\mu \in L_2(\mu)$ for all $p \in \mathcal{P}$, let
$\mathcal{B} = \{b_i, i \ge 1\}$ be a complete orthonormal basis for $L_2(\mu)$.
Since $\mathcal{B}$ is complete, $f = dp/d\mu$ can be written as:

$$f(x) = \sum_{i=1}^{\infty} a_i b_i(x) \tag{6}$$

where

$$a_i = \int_{\mathcal{X}} f(x)\, b_i(x)\, d\mu(x) = E_f[b_i(x)] \tag{7}$$

which will correspondingly introduce the orthogonal series estimator of $f$ based
on a random sample of size $n$, defined as:

$$\hat{f}(x) = \sum_{i=1}^{L_n} \hat{a}_i\, b_i^{*}(x) \tag{8}$$

where $b_i^{*}$ is a fixed version of $b_i$, the number of terms in the expansion
$L_n \to \infty$ as $n \to \infty$, and $a_i$ is replaced by its estimator

$$\hat{a}_i = \frac{1}{n} \sum_{r=1}^{n} b_i^{*}(x_r) \tag{9}$$
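As a minimal sketch of equations (8) and (9), the following code builds an orthogonal series estimate on [0, 1]. The choice of an orthonormal cosine basis with Lebesgue measure, the function names, the Beta(2, 3) test data and the cutoff of six terms are all assumptions made for illustration; none of them is taken from the dissertation.

```python
import numpy as np

def cosine_basis(i, x):
    """i-th element (i >= 1) of an orthonormal cosine basis on [0, 1]."""
    x = np.asarray(x, dtype=float)
    if i == 1:
        return np.ones_like(x)
    return np.sqrt(2.0) * np.cos((i - 1) * np.pi * x)

def orthogonal_series_estimate(sample, grid, n_terms):
    """Evaluate sum_{i=1}^{n_terms} a_hat_i * b_i(x) on the grid."""
    sample = np.asarray(sample, dtype=float)
    grid = np.asarray(grid, dtype=float)
    estimate = np.zeros_like(grid)
    for i in range(1, n_terms + 1):
        a_hat = cosine_basis(i, sample).mean()       # coefficient estimate, eq. (9)
        estimate += a_hat * cosine_basis(i, grid)    # truncated expansion, eq. (8)
    return estimate

rng = np.random.default_rng(1)
sample = rng.beta(2.0, 3.0, size=200)     # illustrative data supported on [0, 1]
grid = np.linspace(0.0, 1.0, 501)
f_hat = orthogonal_series_estimate(sample, grid, n_terms=6)

dx = grid[1] - grid[0]
area = float(np.sum((f_hat[:-1] + f_hat[1:]) * 0.5) * dx)   # trapezoid rule
```

With this particular basis the estimate integrates to one, because the first basis function is constant and the remaining ones integrate to zero. It can still take negative values, which is one of the drawbacks of the method noted in the discussion section of this chapter.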




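The properties of $\hat{f}$ are studied in Bosq (1970) and can be briefly
summarized in the following points:

1. MISE $\to 0$ if and only if

$$\lim_{n \to \infty} \frac{1}{n} \int_{\mathcal{X}} \left[ \sum_{i=1}^{L_n} b_i^{2}(x) \right] f(x)\, d\mu(x) = 0$$

where MISE is the mean integrated square error defined as:

$$\mathrm{MISE} = E \int \left[ \hat{f}(x) - f(x) \right]^2 dx \tag{10}$$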


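2. If

(i) $f$ is continuous,

(ii) $\mathcal{B}^{*} = \{b_i^{*}, i \ge 1\}$ are continuous and

$$M_n = \sup_{1 \le i \le L_n} \sup_{x \in \mathcal{X}} |b_i^{*}(x)| < \infty, \quad n \ge 1,$$

(iii) $\sum_{i=1}^{\infty} a_i b_i^{*}(x) \to f(x)$ uniformly, and

(iv) $\lim_{n \to \infty} M_n^{4} (L_n^{2}/n) = 0$

then

$$\lim_{n \to \infty} \sup_{x \in \mathcal{X}} E\left[ |\hat{f}(x) - f(x)|^2 \right] = 0$$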




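3. If

(i) $\mathcal{B}^{*}$ is uniformly bounded,

(ii) $\sum_{i=1}^{\infty} a_i b_i^{*}(x) \to f(x)$ uniformly,

(iii) there exists $m > 0$ such that

$$\int_{\mathcal{X}} b_i^{2}(x)\, f(x)\, d\mu(x) > m, \quad i \ge 1,$$

(iv) $\lim_{n \to \infty} L_n = \infty$, and

(v) $\sum_{n=1}^{\infty} L_n \exp(-\lambda n / L_n^{2}) < \infty$ for all $\lambda > 0$

then

$$d^{*} = \sup_{x} |\hat{f}(x) - f(x)| \to 0 \quad \text{as } n \to \infty$$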


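4. If

(i) $b_i$ is of bounded variation for all $i \ge 1$,

(ii) $\sum_{i=1}^{\infty} a_i b_i(x) \to f(x)$ uniformly,

(iii) $\lim_{n \to \infty} L_n = +\infty$,

(iv) $\sum_{n=1}^{\infty} \exp\{-\alpha n / (M_n^{2} V_n^{2})\} < \infty$ for all $\alpha > 0$, with

$$M_n = \sup_{1 \le i \le L_n} \sup_{x \in \mathcal{X}} |b_i(x)|, \qquad V_n = \sum_{i=1}^{L_n} v_i$$

then

$$\sup_{x \in \mathcal{X}} |\hat{f}(x) - f(x)| \to 0 \quad \text{as } n \to \infty \tag{13}$$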

The necessary and sufficient conditions for the convergence of the density
estimator using the method of orthogonal functions are given by Bosq and Bleuez
(1976). Finally, the advantages and disadvantages of the method are stated in
the discussion section at the end of this chapter.

The  Method  Of  Penalty  Functions 

This method is characterized by applying a well-known estimation methodology,
the maximum likelihood method, originally introduced by R. A. Fisher, which is
considered a universal method for optimal estimation.

The problem in this case is to find an estimate of the underlying density
function from which a sample of size $n$ was drawn such that the likelihood
function is maximized. This is mathematically formulated as:

$$\max_{f}\; l(f \mid x_1, \ldots, x_n) = \prod_{i=1}^{n} f(x_i) \tag{14}$$

where $x_1, \ldots, x_n$ are i.i.d. random variables with a common unknown density
$f$ and $l$ is the likelihood function of the sample. This likelihood function
does not have a finite maximum when $f$ belongs to the class of density functions
$\mathcal{F}$. This makes it necessary to set restrictions on $\mathcal{F}$ to
avoid that infinite solution.




One approach to using the maximum likelihood principle is to penalize those
functions giving an infinite solution. This infinite solution will essentially
happen if $\mathcal{F}$ contains a sequence of functions that converges pointwise
to a Dirac delta function. This means that the penalization represents a way of
deciding between smoothness and goodness of fit.

Now define a penalty function $P : \mathcal{F} \to \mathbb{R}$ as a real-valued
functional over $\mathcal{F}$; also define
$L(f) = \log l = \sum_{i=1}^{n} \log f(x_i)$ as the log-likelihood function and
define

$$LP : f \mapsto L(f) - \alpha P(f), \quad \alpha > 0 \tag{15}$$

as the logarithm of the penalized likelihood function.

Hence, the problem will be to find a measurable function
$\hat{f} : \mathbb{R}^n \to \mathcal{F}$ such that $LP$ is maximized. This
$\hat{f}$ is called the maximum penalized likelihood estimator of $f$.

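A suggested penalty function (Good and Gaskin 1971) has the form:

$$P(f) = \int_{-\infty}^{+\infty} \frac{[f'(x)]^2}{f(x)}\, dx \tag{16}$$

and the problem is formulated as:

$$\max\; LP(f) = L(f) - \alpha P(f) \tag{17}$$

subject to:

$$\int_{-\infty}^{+\infty} f(x)\, dx = 1 \tag{18}$$

$$f(x) \ge 0 \tag{19}$$

$$f(x_i) > 0, \quad \forall\, i = 1, \ldots, n \tag{20}$$

and

$$P(f) < \infty \tag{21}$$

To avoid the non-negativity constraint, Good and Gaskin used the substitution
$f = g^2$, which transforms the problem to:

$$\max\; LP(f) = 2 \sum_{i=1}^{n} \log |g(x_i)| - 4\alpha \int_{-\infty}^{+\infty} g'^{2}(x)\, dx \tag{22}$$

subject to

$$\int_{-\infty}^{+\infty} g^{2}(x)\, dx = 1 \tag{23}$$

$$\int_{-\infty}^{+\infty} g'^{2}(x)\, dx < \infty \tag{24}$$

and

$$|g(x_i)| > 0, \quad \forall\, i = 1, \ldots, n \tag{25}$$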




The estimator obtained by this method is a spline function with double
exponential splines and knots at the sample points.

An optimal solution for this problem which is twice differentiable with the
same sign for all x was derived by Ghorai (1977).
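The trade-off that the penalty introduces can be illustrated numerically. The sketch below is not the Good and Gaskin algorithm; it only evaluates the criterion (17) with the penalty (16) for a hypothetical family of candidate densities, namely equal-weight normal mixtures centered at the sample points with common spread s. The function names, the grid, the sample and the value of alpha are assumptions made for illustration.

```python
import numpy as np

def mixture_density(points, centers, s):
    """Equal-weight normal mixture with spread s, and its derivative."""
    z = (points[:, None] - centers[None, :]) / s
    phi = np.exp(-0.5 * z**2) / (s * np.sqrt(2.0 * np.pi))
    return phi.mean(axis=1), (-z / s * phi).mean(axis=1)

def penalized_log_likelihood(sample, s, alpha, grid):
    """Return (L(f), L(f) - alpha * P(f)) for the mixture candidate f."""
    sample = np.asarray(sample, dtype=float)
    f_at_data, _ = mixture_density(sample, sample, s)
    log_lik = float(np.sum(np.log(f_at_data)))
    f, df = mixture_density(grid, sample, s)
    ratio = np.zeros_like(f)
    positive = f > 1e-300                      # skip points where f underflows
    ratio[positive] = df[positive] ** 2 / f[positive]
    penalty = float(np.sum(ratio) * (grid[1] - grid[0]))   # Riemann sum for (16)
    return log_lik, log_lik - alpha * penalty

rng = np.random.default_rng(3)
sample = rng.standard_normal(30)               # illustrative data
grid = np.linspace(-8.0, 8.0, 8001)
for s in (1.0, 0.3, 0.1):
    log_lik, penalized = penalized_log_likelihood(sample, s, alpha=0.5, grid=grid)
    print(f"s={s:4.1f}  L={log_lik:8.2f}  LP={penalized:8.2f}")
```

As s shrinks, the candidate approaches a sum of spikes at the data and the unpenalized log-likelihood grows without bound, while the penalty grows roughly like 1/s^2 and eventually dominates. This is the sense in which the penalty decides between smoothness and goodness of fit.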

The  Method  Of  Delta  Sequence 

This method generalizes other methods such as the Fourier inversion method,
the kernel method, histograms and others. To define a delta sequence, let $\Phi$
be an element of the class of continuous functions with continuous derivatives of
all orders, i.e. $\Phi \in C^{\infty}$, with support $I = (a, b)$,
$a, b \in \mathbb{R}$. Then $\Delta = \{\delta_i(x, t)\}$ is a delta sequence on
$I$ if $\delta_i : I \times I \to \mathbb{R}$ is bounded and measurable for all
$i = 1, 2, \ldots$ and, for every $x \in I$,

$$\lim_{i \to \infty} \int_I \delta_i(x, t)\, \Phi(t)\, dt = \Phi(x) \tag{26}$$

An estimator based on that method and an i.i.d. sample $x_1, \ldots, x_n$ from
$f(x)$ would have the form:

$$\hat{f}_i(x) = \frac{1}{n} \sum_{j=1}^{n} \delta_i(x, x_j) \tag{27}$$

which gives a sequence of estimators when using the sequence $\Delta$. This
estimator can give other kinds of estimators, like the ones mentioned above, by a
proper choice of the delta sequence. The necessary and sufficient conditions for
the asymptotic unbiasedness of some delta sequence based estimators are given by
Walter and Blum (1976), while the asymptotic normality of such estimators is
studied by Watson and Leadbetter (1964).
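To make the generalization concrete, the sketch below implements the estimator (27) once and then plugs in two delta sequence elements: a Gaussian kernel element and a histogram element on [0, 1). The function names and the two particular constructions are illustrative assumptions, not definitions taken from the cited papers.

```python
import numpy as np

def delta_estimate(delta, x, sample):
    """Equation (27): average of delta(x, x_j) over the sample."""
    x = np.asarray(x, dtype=float)
    sample = np.asarray(sample, dtype=float)
    return delta(x[:, None], sample[None, :]).mean(axis=1)

def gaussian_delta(h):
    """Kernel-type element: K((x - t) / h) / h with K the standard normal density."""
    def delta(x, t):
        z = (x - t) / h
        return np.exp(-0.5 * z**2) / (h * np.sqrt(2.0 * np.pi))
    return delta

def histogram_delta(m):
    """Histogram-type element on [0, 1): m if x and t share a bin of width 1/m."""
    def delta(x, t):
        same_bin = np.floor(m * x) == np.floor(m * t)
        return m * same_bin.astype(float)
    return delta

rng = np.random.default_rng(4)
sample = rng.uniform(size=100)                 # illustrative data on [0, 1)
points = np.array([0.25, 0.5, 0.75])
kernel_values = delta_estimate(gaussian_delta(0.1), points, sample)
histogram_values = delta_estimate(histogram_delta(10), points, sample)
```

Letting h tend to zero, or m tend to infinity, plays the role of the index i in (26): each element concentrates more sharply around t = x.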

The  Nearest  Neighbor  Method 

This method is based on the choice of a fixed constant $r$; by ordering the
distances of the $n$ observations from a given point $t$, one can pick the $r$th
ordered distance $w_r$. The mathematical formulation of this method comes from
the idea that the number of observations in an interval of width $2 w_r$ centered
at $t$ is exactly $r - 1$. This implies that:

$$r - 1 = 2\, w_r\, n\, \hat{f}(t) \tag{28}$$

which means that:

$$\hat{f}(t) = \frac{r - 1}{2\, w_r\, n} \tag{29}$$

which gives the estimate of $f(t)$ based on the $r$th nearest neighbor. The method
does not give an estimated density that integrates to one. The estimator for this
method has a discontinuous derivative at the points
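Equation (29) translates directly into code. The sketch below is an illustration only; the function name is hypothetical, and it assumes r >= 2 and that fewer than r observations coincide with the evaluation point, so that the rth ordered distance is positive.

```python
import numpy as np

def nearest_neighbor_estimate(t, sample, r):
    """Equation (29): (r - 1) / (2 * n * w_r), w_r = r-th smallest |x_j - t|."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    distances = np.sort(np.abs(sample - t))    # ordered distances from t
    w_r = distances[r - 1]                     # r-th ordered distance
    return (r - 1) / (2.0 * n * w_r)

rng = np.random.default_rng(5)
sample = rng.standard_normal(60)               # illustrative data
values = [nearest_neighbor_estimate(t, sample, r=8) for t in (-1.0, 0.0, 1.0)]
```

Far outside the data the rth ordered distance grows like |t|, so the estimate decays only like 1/|t|; this slow decay is why the estimate does not integrate to one.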


Discussion  Of  Different  Methods 

This  section  surveyed  some  results  about  the  kernel  method,  orthogonal  series 
method,  the  method  of  penalty  functions  and  the  nearest  neighbor  method. 




It is well known that successful use of nonparametric density estimation
techniques requires a sufficient amount of data and reasonable information about
the form of the underlying density function. A Monte Carlo study comparing density
estimators from both the kernel method and the method of orthogonal series for
specific distributions (normal, uniform, etc.) was performed by Kumar and
Markmann (1975).

In kernel estimation one must choose the kernel and the window width. The
choice of the kernel does not significantly affect the efficiency of the
estimator; however, the window choice changes both the bias and the variance of
the estimator of f(x) at each value of x. Since the underlying density is not
known, there is no guarantee that the choice of the window is the optimal one.
However, the kernel method gives an estimator which is a density when the kernel
is chosen to be a density, besides being computationally efficient.

In orthogonal series estimation one has to choose a basis and a cutoff
sequence. The choice of basis will affect the mean integrated square error. The
disadvantage of this method is that the basis is chosen arbitrarily, independent
of the given data, and that it does not give estimates which are densities.
Furthermore, the estimators could be negative. However, it is more efficient
computationally than the kernel method since a few terms give a sufficiently
accurate estimator. A cosine-based estimate has been suggested by Anderson (1969)
to have good characteristics.

In the method of penalty functions, there is some complexity involved in the
calculation of the estimator; however, using a discrete maximum penalized
likelihood makes it less complex. An advantage of the method is the assurance of
the non-negativity of the estimator, since the penalty function is a function of
the logarithm of the density.

The nearest neighbor method was developed to find a computationally fast
technique for estimating the density. Contrary to the kernel method, this method
over-smooths the distribution tails. Also, the estimates in this case are not
everywhere differentiable, and in general the method does not yield an estimator
that integrates to unity.

After examining the methods discussed above in detail, the kernel method is chosen for the applications studied in this research because of the following properties:

(1) It is scale and location invariant if one chooses the parameter h to be scale invariant.

(2)  It  gives  a  proper  density  function  when  the  kernel  is  a  density  function. 

(3)  It  does  not  give  a  negative  estimator. 

(4)  It  has  only  one  parameter. 

(5)  It  directly  picks  the  support. 

(6)  It  is  fast  in  computations. 




III. Monte Carlo Comparison For Some Distributions Using The Kernel Method


Introduction 

The  histogram  as  a  basic  model  for  density  estimation,  and  the  naive  estimator 
are  introduced  in  this  chapter.  The  kernel  method  is  then  surveyed  as  being  a 
natural  evolution  of  the  naive  estimator.  Some  basic  properties  and  results  for  the 
kernel  estimator,  together  with  some  different  kernels  are  introduced.  The  mean  and 
variance  of  the  kernel  density  are  derived.  The  invariance  property  for  the  kernel 
method  is  shown.  Finally  a  Monte  Carlo  experiment  is  designed  to  examine  the 
behavior  of  a  set  of  different  distributions  under  a  proposed  choice  for  the  window 
width of the kernel estimator. The experiment uses different distributions, with the
mean integrated squared error as the criterion for the comparison.

The  Histogram 

The histogram, if it is constructed so that it integrates to one, is simply an estimate of the p.d.f. as a function which varies according to a predetermined division of the support of the estimator. It is expressed in terms of the number of observations from a sample of size n, $(X_1, X_2, \ldots, X_n)$, falling in each of the subdivisions (the mesh) of the support in the following way:

$$\hat{f}(x) = \frac{1}{nh}\,(\#\text{ of } X_i \text{ in the same bin as } x) \tag{31}$$

where h represents the width of each mesh cell or bin and is known as the bin width.

The bin width h can be allowed to vary, in which case the estimator takes the form:

$$\hat{f}(x) = \frac{1}{n} \times \frac{\#\text{ of } X_i \text{ in the same bin as } x}{\text{width of bin containing } x}$$

The  basic  properties  for  this  estimator  can  be  summarized  as: 

-  Simple  and  easy  way  of  data  representation. 

-  It  depends  on  the  choice  of  origin  and  bin  width. 

-  In  bivariate  and  trivariate  samples,  it  depends  on  the  grid  direction  of  the 
cells  (besides  origin  and  bin  width) 


The  Naive  Estimator 

Since f(x) can be expressed as the limit of the rate of change of F(x), then

$$f(x) = \lim_{\Delta x \downarrow 0} \frac{F(x+\Delta x) - F(x)}{\Delta x} = \lim_{\Delta x \downarrow 0} \frac{F(x+\Delta x) - F(x-\Delta x)}{2\,\Delta x} \tag{32}$$

Hence, it is reasonable to estimate f(x) by $\hat{f}(x)$ as

$$\hat{f}(x) = \frac{F_n(x+\Delta x) - F_n(x-\Delta x)}{2\,\Delta x}$$

where

$$F_n(x) = \frac{\text{no. of } X_i\text{'s} \le x}{n} \tag{34}$$

Now, using the conventional notation $h_n$ for the bin width in place of $\Delta x$, where $h_n$ varies with the sample size n, then

$$\hat{f}(x) = \frac{1}{2\,n\,h_n}\,[\text{no. of } X_i\text{'s} \in (x-h_n,\, x+h_n)] \tag{35}$$

where $h_n \to 0$ as $n \to \infty$.

This  estimator  is  known  as  the  naive  estimator.  The  naive  estimator  can  be 
considered  as  a  histogram  with  each  observation  as  a  center  of  a  sampling  interval. 

This method gives a discontinuous estimator with jumps at $X_i \pm h_n$ and with
zero derivative everywhere else.
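The naive estimator described above can be sketched in a few lines. The following is a minimal illustration only, assuming the count is taken over the open interval $(x-h, x+h)$; the function name `naive_estimate` and the small sample are hypothetical and do not come from the dissertation.

```python
def naive_estimate(x, sample, h):
    """Naive density estimate: (1 / (2 n h)) * #{X_i in (x - h, x + h)}."""
    n = len(sample)
    # Count the observations falling in the open interval centred at x.
    count = sum(1 for xi in sample if x - h < xi < x + h)
    return count / (2.0 * n * h)


# Hypothetical sample used only for illustration.
demo_sample = [0.1, 0.4, 0.5, 0.9]
demo_value = naive_estimate(0.45, demo_sample, 0.25)  # two points lie in (0.2, 0.7)
print(demo_value)
```

Because the estimate is a step function of x, evaluating it on a fine grid shows the jumps at $X_i \pm h$ mentioned in the text.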


The  Kernel  Method 

In this section, a more detailed discussion of the kernel estimators and their properties is given. The naive estimator involves the idea of looking for a function through which one is able to obtain a measure of the number of $X_i$'s in the interval $(x-h_n,\, x+h_n)$. Such a function is known as a kernel function K(.), satisfying the regularity conditions:

(i) $\sup_x K(x) < M < \infty$, $|x|\,K(x) \to 0$ as $|x| \to \infty$.

(ii) K(x) is symmetric, $\int_{-\infty}^{+\infty} x^2 K(x)\,dx < \infty$.

(iii) K(x) has an absolutely integrable characteristic function,

and the estimator suggested in this case has the form:

$$\hat{f}(x) = \frac{1}{n h_n}\sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h_n}\right) \tag{36}$$

where $h_n \to 0$ as $n \to \infty$.

The previous discussion gives a brief introduction to the concept. In the case of univariate spaces with continuous variables, the concept can be summarized as placing a kernel at each point of the design sample $\{X_1, \ldots, X_n\}$. Averaging the contributions of the different kernels at all points of the support results in the kernel estimator.
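As a rough sketch of this construction, the code below places a kernel at each observation and averages the contributions. It assumes a standard Gaussian kernel by default; the names `gaussian_kernel`, `kernel_estimate` and `integrate_estimate` are illustrative choices, not names used in the dissertation.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_estimate(x, sample, h, kernel=gaussian_kernel):
    """Kernel estimate (1 / (n h)) * sum_i K((x - X_i) / h)."""
    n = len(sample)
    return sum(kernel((x - xi) / h) for xi in sample) / (n * h)


def integrate_estimate(sample, h, lo, hi, points=4001):
    """Trapezoid-rule integral of the estimate over [lo, hi]."""
    step = (hi - lo) / (points - 1)
    total = 0.0
    for j in range(points):
        weight = 0.5 if j in (0, points - 1) else 1.0
        total += weight * kernel_estimate(lo + j * step, sample, h)
    return total * step


# Hypothetical sample; the estimate should integrate to (approximately) one.
demo_area = integrate_estimate([-1.0, 0.0, 2.0], 0.5, -10.0, 10.0)
print(demo_area)
```

Since the kernel is itself a density, the resulting estimate is non-negative and integrates to one, which the numerical integral illustrates.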

The kernel estimator resolves the major difficulties with histograms: the fixed cell structure, the discontinuities at cell boundaries, the lack of tails, and the exponential increase in the number of cells as the number of variables increases. In spite of this, the kernel estimator has the problem of choosing a proper $h_n$. For a fixed n, a large $h_n$ gives a very smooth estimate, and a small one gives an irregular estimate. As $h_n \to 0$ the nonparametric density estimate converges to a series of spikes at the observations. This means that a difficulty corresponding to the choice of the cell size in the histogram remains.

The  mathematical  properties  for  the  univariate  kernel  estimators  are  well 
known.  These  include  the  bias  and  the  asymptotic  results. 


The asymptotic properties of such an estimator are investigated by Parzen (1962). The necessary and sufficient conditions for uniform consistency with probability one for kernel estimators are studied by Nadaraja (1965) and Schuster (1970). Based on their study of the properties of the kernel estimator, the following theorem holds:

1. Let K(.) be a kernel function of bounded variation, and let $\sum_{j=1}^{\infty} \exp(-\gamma\, j\, h_j^2)$ converge $\forall\, \gamma > 0$. Then

$$S = \sup_x |\hat{f}(x) - f(x)| \to 0 \ \text{ with probability 1 as } n \to \infty$$

if and only if f is uniformly continuous.


Now, several results on the consistency of the kernel estimator are introduced. First define:

$$J = \int |\hat{f} - f|$$

then the following results hold:

1. If the kernel K is a Borel measurable function on $\mathbb{R}^m$ such that $K \ge 0$, $\int K = 1$, then the following statements are equivalent:

(i) $J \to 0$ in probability as $n \to \infty$ for some f.

(ii) $J \to 0$ in probability as $n \to \infty$, $\forall\, f$.

(iii) $J \to 0$ almost surely as $n \to \infty$, $\forall\, f$.

(iv) $J \to 0$ exponentially as $n \to \infty$, $\forall\, f$,

where exponential convergence means: given $\epsilon > 0$, $\exists\, r, n_0 > 0$ such that $P(J > \epsilon) < \exp(-rn)$, $n > n_0$.

(v) $\lim_{n\to\infty} h_n = 0$, $\lim_{n\to\infty} n\,(h_n)^m = \infty$.

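2. For any density f on $\mathbb{R}^m$, if K is an absolutely integrable function such that $\int K = 1$, and $\lim_{n\to\infty} h_n = 0$, $\lim_{n\to\infty} n\,(h_n)^m = \infty$, then

$J \to 0$ exponentially as $n \to \infty$, $\forall\, f$.

3. If K, f are densities on $\mathbb{R}^m$ and $J \to 0$ in probability as $n \to \infty$, then

$$\lim_{n\to\infty} h_n = 0, \qquad \lim_{n\to\infty} n\,(h_n)^m = \infty.$$

4. $\sup_x |\hat{f}(x) - E[\hat{f}(x)]| \to 0$ as $n \to \infty$, $\forall$ distributions F.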


5. Let $B(x) = E[\hat{f}(x)] - f(x)$ be the bias of the estimator, and

$$m_2 = \int_{-\infty}^{+\infty} x^2 K(x)\,dx$$

where K(x) satisfies:

(i) $\sup_x K(x) < M < \infty$; $|x|\,K(x) \to 0$ as $|x| \to \infty$.

(ii) $K(x) = K(-x)$, $x \in \mathbb{R}$; $\int_{-\infty}^{+\infty} x^2 K(x)\,dx < \infty$.

If f is a bounded density function and if $f''(x)$ exists, then

$$B(x) \approx \frac{h^2}{2}\, m_2\, f''(x)$$

The choice of the smoothing parameter h is more crucial than the choice of the kernel itself. The approximate MISE as a function of h is given by:

$$\frac{1}{nh}\int K^2(t)\,dt + \frac{m_2^2}{4}\int \{f''(x)\,h^2\}^2\,dx \tag{37}$$

which, upon differentiation w.r.t. h and equating to zero, gives the optimal $h_{opt}$:

$$h_{opt} = m_2^{-2/5}\left\{\int K^2(t)\,dt\right\}^{1/5}\left\{\int f''(x)^2\,dx\right\}^{-1/5} n^{-1/5} \tag{38}$$

The h value gets bigger as the second derivative of f(x) gets smaller, and consequently this gives a smoother estimator and a smaller approximate MISE.

An approach to the kernel choice that uses the calculus of variations to derive a kernel that optimizes the approximate MISE gives a kernel with an efficiency of 1, which is known as the Epanechnikov kernel. This kernel is given as:

$$K_e(x) = \begin{cases} \dfrac{3}{4\sqrt{5}}\left(1 - \dfrac{x^2}{5}\right) & -\sqrt{5} \le x \le \sqrt{5} \\ 0 & \text{otherwise} \end{cases}$$

Defining the efficiency of a kernel as the ratio of its MISE relative to that of the Epanechnikov kernel, the relative efficiencies of different kernels are given in Table 1.


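Table 1. Different Kernels with their Efficiency

Kernel         K(x)                                                                    Efficiency
Epanechnikov   $\frac{3}{4\sqrt{5}}(1 - x^2/5)$ if $-\sqrt{5} \le x \le \sqrt{5}$, 0 otherwise   1.00
Boxed          $\frac{1}{2}$ if $-1 < x < 1$, 0 otherwise                                0.9295
Biweight       $\frac{15}{16}(1 - x^2)^2$ if $-1 < x < 1$, 0 otherwise                   0.9939
Gaussian       $\frac{1}{\sqrt{2\pi}}\exp(-x^2/2)$                                       0.9512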

The idea of choosing a smoothing parameter value that subjectively agrees with a priori information about the underlying distribution is valuable in specific applications, even if it seems not to be a nonparametric approach. In other words, the choice of the smoothing parameter in some applications can be made by making use of the information known, or at least assumed, about the form of the distribution.

Different approaches have been proposed to find a reasonable choice of the h parameter. Among those methods are least squares cross validation, likelihood cross validation, and the test graph method.




A  simulation  of  a  comparative  study  of  some  of  the  kernel  methods  for  sample 
sizes  25,  50,  and  100  for  different  distributions  of  varying  tail  length  is  presented  by 
Bowman  in  his  1980  paper. 


Mean  and  Variance  Of  The  Estimator 
Theorem 

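Let K(x) be a symmetric kernel with mean $E_K(x)$ and variance $V_K(x)$ such that $E_K(x), V_K(x) < \infty$ and $\int K(x)\,dx = 1$. If $\hat{f}(x)$ is a nonparametric kernel estimator based on a sample of size n $(X_1, \ldots, X_n)$ with K(x) as a kernel, then $\hat{f}(x)$ has mean $\bar{x}$ and variance $V_K(x) + s^2$, where $\bar{x}$ is the sample mean and $s^2$ is the sample variance.

proof

The kernel estimator based on the sample $(X_1, \ldots, X_n)$ is:

$$\hat{f}(x) = \frac{1}{nh}\sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right) \tag{39}$$

Hence the expected value of the random variable x with $\hat{f}(x)$ as a density function will be:

$$E(x) = \int x\,\hat{f}(x)\,dx = \frac{1}{n}\sum_{i=1}^{n}\int x\,\frac{1}{h}K\!\left(\frac{x - X_i}{h}\right)dx = \frac{1}{n}\sum_{i=1}^{n} X_i = \bar{x}$$

since each kernel is symmetric about its observation $X_i$. As for the variance we have:

$$E(x - \bar{x})^2 = E(x^2) - \bar{x}^2 = \frac{1}{n}\sum_{i=1}^{n}\int x^2\,\frac{1}{h}K\!\left(\frac{x - X_i}{h}\right)dx - \bar{x}^2 = \frac{1}{n}\sum_{i=1}^{n}\left[V_K(x) + X_i^2\right] - \bar{x}^2 = V_K(x) + s^2$$

where $V_K(x)$ is the variance of the kernel as scaled by h, and $s^2 = \frac{1}{n}\sum_{i=1}^{n} X_i^2 - \bar{x}^2$.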


Corollaries 


1. The mean and the variance of a kernel estimator with a Gaussian kernel asymptotically approach the mean and variance of the empirical distribution function. This can be shown in the following way:

In the case of a Gaussian kernel, with $\hat{f}(x)$ given as:

$$\hat{f}(x) = \frac{1}{nh}\sum_{i=1}^{n}\frac{1}{\sqrt{2\pi}}\exp\left[-\frac{(x - X_i)^2}{2h^2}\right]$$

the expected value and the variance will be:

$$E(x) = \bar{x}$$

$$V(x) = h^2 + s^2 \tag{43}$$

and since $h \to 0$ as $n \to \infty$, $E(x) = \bar{x}$ and $V(x) \to s^2$, which are the mean and the variance of the empirical distribution function.
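The mean and variance result for the Gaussian kernel can be checked numerically. The sketch below integrates $x\hat{f}(x)$ and $x^2\hat{f}(x)$ with the trapezoid rule for a small hypothetical sample; the function names, the sample, and the value of h are assumptions made only for illustration, and $s^2$ is taken with divisor n as in the derivation.

```python
import math


def gaussian_kde(x, sample, h):
    """Gaussian kernel estimate evaluated at x."""
    n = len(sample)
    scale = 1.0 / (n * h * math.sqrt(2.0 * math.pi))
    return scale * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)


def kde_mean_and_variance(sample, h, points=20001):
    """Mean and variance of the estimated density by trapezoid integration."""
    lo = min(sample) - 10.0 * h   # ten kernel standard deviations of padding
    hi = max(sample) + 10.0 * h
    step = (hi - lo) / (points - 1)
    first = second = 0.0
    for j in range(points):
        x = lo + j * step
        weight = 0.5 if j in (0, points - 1) else 1.0
        mass = weight * step * gaussian_kde(x, sample, h)
        first += x * mass
        second += x * x * mass
    return first, second - first ** 2


demo_sample = [1.0, 2.0, 4.0, 7.0]   # hypothetical data
demo_h = 0.8
kde_mean, kde_var = kde_mean_and_variance(demo_sample, demo_h)

sample_mean = sum(demo_sample) / len(demo_sample)
sample_var = sum((x - sample_mean) ** 2 for x in demo_sample) / len(demo_sample)
print(kde_mean, sample_mean)
print(kde_var, demo_h ** 2 + sample_var)
```

The two printed pairs should agree closely, which is the content of equation (43).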

2. For different kernels $K_i(x)$ with variances $V_i$, i = 1, ..., n, each used at one of the sample points $X_1, \ldots, X_n$ respectively, the mean remains the sample mean while the variance will be:

$$V(x) = \frac{1}{n}\sum_{i=1}^{n} V_i(x) + s^2 \tag{44}$$

The kernel estimator is location and scale invariant, and this property is derived in the following section:

Invariance  Property  Of  The  Kernel  Method 

The  invariance  property  for  the  kernel  method  is  shown  in  this  section  under 
two  transformations.  First,  the  location  transformation  where  all  the  observations 
are  moved  either  to  the  left  or  to  the  right.  Second,  the  scale  transformation  where 
all  the  observations  are  either  compressed  or  expanded  by  a  constant  factor. 




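a) The transformation

$$Z_i = X_i - c \tag{45}$$

$$\hat{f}(x) = \frac{1}{nh}\sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right) \tag{46}$$

thus

$$\hat{g}(x) = \frac{1}{nh}\sum_{i=1}^{n} K\!\left(\frac{x + c - X_i}{h}\right) = \hat{f}(x + c) \tag{47}$$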

b) The transformation

$$Z_i = X_i / k \tag{48}$$

Since the h value is a linear function of the sample standard deviation, the transformation of the data by a scale k in the above way results in a new value of h such that:


where $X_1, X_2, \ldots, X_n$ are independent, identically distributed observations from f,
and K(.) is a function that satisfies the regularity conditions (i)-(iii) stated in the section on the kernel method.

The choice of the parameter h in the kernel method is critical, since this parameter controls the smoothness of the resulting estimator.

In the univariate case the h parameter can frequently be chosen visually in a satisfactory manner (Wahba, 1983). However, the need for a predetermined choice of the h parameter in most applications of nonparametric density estimation suggests examining the behavior of some different distributions under a proposed choice of h. A Monte Carlo experiment of size 1000 is used to examine the behavior of the estimators for six different distributions.
These distributions are:

-Uniform. 

-Exponential. 

-Cauchy. 

-Double  Exponential. 

-Logistic. 

-Normal. 

The criterion chosen for the comparison is the mean integrated square error.

The optimum choice for h is shown in equation (38) to be a constant times $n^{-1/5}$. Furthermore, Silverman (1986) shows that the optimum h for the normal is $1.06\,\sigma\, n^{-1/5}$ using a normal kernel. Therefore, a data dependent h equal to $s\,n^{-1/5}$, where s represents the sample standard deviation for sample size n, is chosen for the Monte Carlo experiment. It also gives a scale invariant nonparametric density estimate, since s is a scale invariant estimator of $\sigma$.

Now, the data based choice of h is used with the kernel technique with a Gaussian kernel, in which case the estimator takes the form:

$$\hat{f}(x) = \frac{1}{nh}\sum_{i=1}^{n}\phi\!\left(\frac{x - X_i}{h}\right) \tag{52}$$

where $\phi(x)$ represents the p.d.f. of the standard normal distribution. Sample sizes 10, 20, ..., 60, i.e. 10(10)60, are used, and the MISE is defined as:

$$\mathrm{MISE} = \int E[\hat{f}(x) - f(x)]^2\,dx = E\int [\hat{f}(x) - f(x)]^2\,dx \tag{53}$$

where $\hat{f}(x)$ denotes the nonparametric estimator based on the previous choice of h, while f(x) is one of the six distributions mentioned.

To  evaluate  the  performance  of  the  method  over  the  various  distributions,  the 
Monte  Carlo  experiment  is  designed  the  same  way  for  all  the  six  distributions  and 
the  different  sample  sizes. 

The methodology is such that a sample of a given size 10(10)60 from each of the distributions is generated using the IMSL routines RNUN, RNEXP, RNCAU, and RNNOR for the uniform, exponential, Cauchy, and normal distributions respectively, while an inverse c.d.f. technique is used for the double exponential and the logistic distributions. The data based choice of the smoothing parameter is then calculated for each of the 1000 different samples. The integrated square error (ISE), given as:

$$\mathrm{ISE} = \int [\hat{f}(x) - f(x)]^2\,dx \tag{54}$$

is then computed for each sample using the IMSL integration routine QDAGI with bounds $-\infty$ and $\infty$. The bounds are changed to (-50, +50) only for the logistic distribution, to avoid the numerical difficulty of computation beyond these limits. An estimate of MISE is then obtained by averaging the ISE from the 1000 Monte Carlo repetitions. Likewise, an estimate of the standard deviation of MISE is computed. The results of the Monte Carlo experiment for the different sample sizes are given in Table 2, where the table entries give the MISE for different sample sizes with the standard deviation in brackets.
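The experiment described above can be sketched for a single case. The code below is only an illustration for the standard normal distribution: it uses Python's `random` and `statistics` modules in place of the IMSL routines, a trapezoid rule on a truncated interval in place of QDAGI, and the n-1 divisor for the sample standard deviation. All of these, along with the function names, are assumptions rather than the dissertation's implementation.

```python
import math
import random
import statistics


def std_normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def gaussian_kde(x, sample, h):
    n = len(sample)
    return sum(std_normal_pdf((x - xi) / h) for xi in sample) / (n * h)


def integrated_square_error(sample, h, true_pdf, lo=-8.0, hi=8.0, points=321):
    """Trapezoid approximation of the ISE on the truncated interval [lo, hi]."""
    step = (hi - lo) / (points - 1)
    total = 0.0
    for j in range(points):
        x = lo + j * step
        weight = 0.5 if j in (0, points - 1) else 1.0
        total += weight * (gaussian_kde(x, sample, h) - true_pdf(x)) ** 2
    return total * step


def monte_carlo_mise(n, repetitions, seed=0):
    """Average ISE (and its standard deviation) with h = s * n**(-1/5)."""
    rng = random.Random(seed)
    errors = []
    for _ in range(repetitions):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        h = statistics.stdev(sample) * n ** (-0.2)   # data based choice of h
        errors.append(integrated_square_error(sample, h, std_normal_pdf))
    return statistics.mean(errors), statistics.stdev(errors)


# A small run; the dissertation uses 1000 repetitions per case.
mise_estimate, mise_sd = monte_carlo_mise(n=20, repetitions=100, seed=1)
print(mise_estimate, mise_sd)
```

With more repetitions the average should settle near the corresponding entry of Table 2, although the exact figure depends on the random number generator and the integration settings.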

The results of the Monte Carlo experiment show that the choice of h which is near optimal for the normal ($h_{opt}$ for the normal is $1.06\,\sigma\, n^{-1/5}$) gives comparable results for the double exponential and the logistic distributions, while a reasonable fit is found for the Cauchy. A relatively large MISE is obtained for the uniform and exponential distributions, which indicates that this choice is not as close to optimal for these distributions.


Table 2. Values of MISE for different distributions, with standard deviation in brackets, based on M.C. size 1000 and a sample of size n for each repetition

n    Uniform             Expon.              Cauchy              D.E.                Logistic            Normal
10   0.19142 (0.13260)   0.15297 (0.05?75)   0.06277 (0.03842)   0.04318 (0.02920)   0.06412 (0.06444)   0.03004 (0.03561)
20   0.12664 (0.05745)   0.12950 (0.03629)   0.06834 (0.04147)   0.02672 (0.01526)   0.04487 (0.03402)   0.01970 (0.01617)
30   0.10690 (0.03691)   0.11982 (0.02800)   0.06991 (0.04038)   0.02140 (0.01146)   0.04131 (0.02593)   0.01107 (0.01042)
40   0.09509 (0.03028)   0.11237 (0.02478)   0.07368 (0.04011)   0.01835 (0.00067)   0.03989 (0.02234)   0.01151 (0.00843)
50   0.08800 (0.02429)   0.10575 (0.02220)   0.07727 (0.03953)   0.01651 (0.00825)   0.03916 (0.01934)   0.00996 (0.00704)
60   0.08267 (0.02017)   0.10191 (0.01889)   0.07998 (0.03921)   0.01503 (0.00731)   0.03904 (0.01763)   0.00883 (0.00635)


IV.  Optimal  Choice  Of  The  Smoothing  Parameter 


Introduction 

The choice of the h parameter is the most essential step in successful nonparametric density estimation using the kernel method. This choice is theoretically derived from the optimization of the approximate MISE defined in chapter III. In this chapter a Monte Carlo experiment is performed to approximate the optimal choice of the h parameter for the Gaussian kernel. This choice is crucial for a goodness of fit application, as well as for any other application that requires a nonparametric estimate of the density. The distributions considered are:

-  Uniform 

-  Exponential 

-  Cauchy 

-  Double  Exponential 

-  Logistic 

-  Normal 

These  distributions  represent  different  shapes  and  characteristics. 

Hence, the purpose of this chapter is to find an optimal h and the corresponding MISE for 1000 different samples, each of size 20, from the above distributions.


Methodology 


The optimal h w.r.t. minimizing the approximate MISE is given as:

$$h_{opt} = m_2^{-2/5}\left\{\int K^2(t)\,dt\right\}^{1/5}\left\{\int f''(x)^2\,dx\right\}^{-1/5} n^{-1/5} \tag{55}$$

where $m_2$ is the kernel second moment (see Parzen, 1962).

This  approximate  optimal  value  hopt  is  derived  for  the  different  distributions 
as  a  first  step: 

1.  Uniform  distribution 

For  the  uniform  distribution  the  approximate  expression  gives  a  zero  h  since 
the  density  is  constant.  This  case  corresponds  to  the  E.D.F  estimator.  However  the 
M.C  results  indicate  that  this  value  does  differ  from  zero. 

2.  Exponential  distribution 

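For the one parameter exponential distribution with variance $V(x) = 1/\alpha^2$ and p.d.f. given as:

$$f(x) = \alpha e^{-\alpha x}, \quad \alpha > 0,\ x > 0 \tag{56}$$

$$f'(x) = -\alpha^2 e^{-\alpha x} \tag{57}$$

$$f''(x) = \alpha^3 e^{-\alpha x} \tag{58}$$

so that $\int f''(x)^2\,dx = \alpha^5/2 = 1/(2\sigma^5)$, while $m_2 = 1$ and $\int K^2(t)\,dt = 1/(2\sqrt{\pi})$ for the Gaussian kernel, where $\sigma$ is the standard deviation of the distribution. Hence by substituting in the formula for the approximate optimal h, the corresponding h for this distribution will be:

$$h_{opt} = .8918\,\sigma\, n^{-1/5} \tag{61}$$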


3.  Cauchy  distribution 

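For the Cauchy distribution with a density f(x), the optimal h is derived below:

$$f(x) = \frac{1}{\pi\,[(x-a)^2 + 1]}, \qquad -\infty < x < \infty,\ -\infty < a < \infty \tag{62}$$

The second derivative $f''(x)$, equation (63), and the beta-function expansion of the integral in equation (64) are not recoverable from the scanned source; only the arguments $B(5/2, 7/2)$, $B(3/2, 9/2)$ and $B(1/2, 11/2)$ and the final values survive:

$$\int_{-\infty}^{\infty} f''(x)^2\,dx = .1992 \tag{64}$$

$$h_{opt} = 1.0721\, n^{-1/5} \tag{65}$$

where B is a beta function.

4. Double Exponential distribution

For the double exponential density given by:

$$f(x) = \frac{1}{2\theta}\, e^{-|x|/\theta}, \quad \theta > 0 \tag{66}$$

and similar to the previous case the optimal approximate h is:

$$h_{opt} = .7244\,\sigma\, n^{-1/5} \tag{67}$$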

5.  Logistic  distribution 


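The logistic density function in two parameters is given by:

$$f(x) = \frac{\exp[-(x-a)/b]}{b\,\left(1 + \exp[-(x-a)/b]\right)^2} \tag{68}$$

with

$$E(x) = a, \qquad V(x) = \frac{b^2\pi^2}{3}, \qquad \mathrm{mode}(x) = a$$

$$f'(x) = -\frac{1}{b^2}\,\frac{e^{-\frac{x-a}{b}}\left(1 + e^{-\frac{x-a}{b}}\right)^2 - 2e^{-2\frac{x-a}{b}}\left(1 + e^{-\frac{x-a}{b}}\right)}{\left(1 + e^{-\frac{x-a}{b}}\right)^4} = -\frac{e^{-\frac{x-a}{b}}\left(1 - e^{-\frac{x-a}{b}}\right)}{b^2\left(1 + e^{-\frac{x-a}{b}}\right)^3} \tag{69}$$

$$f''(x) = \frac{e^{-\frac{x-a}{b}}\left[\left(1 - 2e^{-\frac{x-a}{b}}\right)\left(1 + e^{-\frac{x-a}{b}}\right) - 3e^{-\frac{x-a}{b}}\left(1 - e^{-\frac{x-a}{b}}\right)\right]}{b^3\left(1 + e^{-\frac{x-a}{b}}\right)^4} = \frac{e^{-\frac{x-a}{b}}\left[1 - 4e^{-\frac{x-a}{b}} + e^{-2\frac{x-a}{b}}\right]}{b^3\left(1 + e^{-\frac{x-a}{b}}\right)^4} \tag{70}$$

Now, let $y = \exp\left(-\frac{x-a}{b}\right)$ and hence

$$\int_{-\infty}^{\infty} f''(x)^2\,dx = \int_0^{\infty} \frac{y^2\,(1 - 4y + y^2)^2}{b^6\,(1+y)^8}\,\frac{b}{y}\,dy = \frac{1}{b^5}\int_0^{\infty} \frac{y\,(1 - 4y + y^2)^2}{(1+y)^8}\,dy$$

$$= \frac{1}{b^5}\left[B(2,6) - 8B(3,5) + 18B(4,4) - 8B(5,3) + B(6,2)\right] = \frac{1}{b^5}\left[2B(2,6) - 16B(3,5) + 18B(4,4)\right] = \frac{1}{b^5}\,(.02381) \tag{71}$$

Thus, the optimal h for the Gaussian kernel is given as:

$$h_{opt} = 1.6396\, b\, n^{-1/5} = \frac{\sqrt{3}}{\pi}\,(1.6396)\,\sigma\, n^{-1/5} = 0.9039\,\sigma\, n^{-1/5} \tag{72}$$

where $\sigma$ is the distribution standard deviation.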


6.  Normal  distribution 

Silverman (1986) shows, as an example, that the value of h based on the previous approximation is:

$$h_{opt} = 1.06\,\sigma\, n^{-1/5} \tag{73}$$


The h values obtained are summarized in the following table (Table 3).

Table 3. h Values for Different Distributions

Distribution          h
Exponential           $.8918\,\sigma\, n^{-1/5}$
Cauchy                $1.0721\, n^{-1/5}$
Double Exponential    $.7244\,\sigma\, n^{-1/5}$
Logistic              $.9039\,\sigma\, n^{-1/5}$
Uniform               0.0
Normal                $1.06\,\sigma\, n^{-1/5}$
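The constants in Table 3 can be reproduced numerically from the formula for $h_{opt}$ with a Gaussian kernel ($m_2 = 1$, $\int K^2 = 1/(2\sqrt{\pi})$). The sketch below integrates $f''(x)^2$ with a simple trapezoid rule for unit-scale versions of four of the densities; the Cauchy entry is left out because its intermediate expressions are not legible in the source. Function names, integration limits and grid sizes are illustrative assumptions.

```python
import math


def trapezoid(func, lo, hi, points):
    """Composite trapezoid rule on [lo, hi]."""
    step = (hi - lo) / (points - 1)
    total = 0.5 * (func(lo) + func(hi))
    for j in range(1, points - 1):
        total += func(lo + j * step)
    return total * step


def h_constant(second_derivative, lo, hi, sigma, points=80001):
    """Constant c in h_opt = c * sigma * n**(-1/5) for the Gaussian kernel."""
    roughness = trapezoid(lambda x: second_derivative(x) ** 2, lo, hi, points)
    kernel_roughness = 1.0 / (2.0 * math.sqrt(math.pi))   # integral of K^2
    return (kernel_roughness / roughness) ** 0.2 / sigma


def logistic_f2(x):
    y = math.exp(-x)                       # a = 0, b = 1
    return y * (1.0 - 4.0 * y + y * y) / (1.0 + y) ** 4


def normal_f2(x):
    return (x * x - 1.0) * math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


constants = {
    # alpha = 1, so sigma = 1 and f''(x) = exp(-x) on x > 0
    "exponential": h_constant(lambda x: math.exp(-x), 0.0, 40.0, 1.0),
    # theta = 1, so sigma = sqrt(2); f'' taken away from the origin
    "double exponential": h_constant(
        lambda x: 0.5 * math.exp(-abs(x)), -40.0, 40.0, math.sqrt(2.0)),
    # b = 1, so sigma = pi / sqrt(3)
    "logistic": h_constant(logistic_f2, -40.0, 40.0, math.pi / math.sqrt(3.0)),
    "normal": h_constant(normal_f2, -40.0, 40.0, 1.0),
}
for name, value in constants.items():
    print(name, round(value, 4))
```

The printed values should agree with the table entries to the precision shown there (the normal case gives roughly 1.059, which the table rounds to 1.06).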

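A numerical improvement of the previously recommended h is sought, based on the use of an unbiased estimator for the standard deviation and a linear search around the previous h value for an h with smaller MISE. The general form of the proposed estimator for h is a constant multiple of the unbiased estimator of $\sigma$ times $n^{-1/5}$. Let $\hat{\sigma}$ represent the unbiased estimator of $\sigma$, which is given as:

$$\hat{\sigma} = \frac{s\,\sqrt{n}\;\Gamma\!\left(\frac{n-1}{2}\right)}{\sqrt{2}\;\Gamma\!\left(\frac{n}{2}\right)} \tag{74}$$

where $\Gamma$ is the gamma function and s is given by:

$$s = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2} \tag{75}$$

thus

$$\hat{\sigma} = \sqrt{\sum_{i=1}^{n}(X_i - \bar{X})^2}\;\frac{\Gamma\!\left(\frac{n-1}{2}\right)}{\sqrt{2}\;\Gamma\!\left(\frac{n}{2}\right)} \tag{76}$$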
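A short sketch of this estimator follows. It assumes the form in equation (76), which is unbiased for $\sigma$ when the data are normal, and it uses `math.lgamma` so that the ratio of gamma functions does not overflow for large n; the function name `unbiased_sigma` is an illustrative choice.

```python
import math


def unbiased_sigma(sample):
    """sqrt(sum (X_i - mean)^2) * Gamma((n-1)/2) / (sqrt(2) * Gamma(n/2))."""
    n = len(sample)
    mean = sum(sample) / n
    sum_of_squares = sum((x - mean) ** 2 for x in sample)
    # Ratio of gamma functions computed on the log scale for stability.
    log_ratio = math.lgamma((n - 1) / 2.0) - math.lgamma(n / 2.0)
    return math.sqrt(sum_of_squares / 2.0) * math.exp(log_ratio)


# For n = 2 the ratio is Gamma(1/2) / Gamma(1) = sqrt(pi).
demo_sigma = unbiased_sigma([0.0, 2.0])
print(demo_sigma)
```

The correction matters most for small samples; as n grows the estimator approaches the usual sample standard deviation.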


The Monte Carlo experiment here is designed for a sample size of 20, in which case the optimal h is assumed to be $h_{opt} = k\,\hat{\sigma}\, n^{-1/5}$. The experiment starts by generating 1000 samples, each of size 20, from the 6 distributions (uniform, exponential, Cauchy, double exponential, logistic, and normal). Defining an interval $\mathcal{H}$ such that:

$$\mathcal{H} = \{h_1 \mid h_{opt} - l \le h_1 \le h_{opt} + u\} \tag{77}$$

where $h_{opt}$ is as defined in the above table, a search in the closed interval $\mathcal{H}$ for an $h_1$ that minimizes the MISE is performed. The search starts by subdividing the interval into a mesh of m equal subintervals. Computing the MISE at each of the (m+1) end points of the subintervals gives an array of MISE's. The minimum MISE corresponds to an optimal $h_1$ value in $\mathcal{H}$. If the optimal $h_1$ lies on either end of the interval $\mathcal{H}$, then the search interval is expanded by l or u for the lower or upper end point of $\mathcal{H}$ respectively, and the search continues.
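The search procedure described above can be sketched generically. In the code below, `objective` stands for whatever error measure is being minimized over h (a hypothetical quadratic is used in the demonstration), and the lower bound `floor`, the cap on expansions, and the function name are assumptions added so that the example terminates and keeps h positive.

```python
def grid_search(objective, center, lower, upper, m=20,
                max_expansions=50, floor=1e-6):
    """Minimize objective(h) on a mesh over [center - lower, center + upper].

    If the minimum falls on an end point, the interval is extended by
    `lower` or `upper` on that side and the mesh is evaluated again.
    """
    lo = max(center - lower, floor)
    hi = center + upper
    for _ in range(max_expansions + 1):
        step = (hi - lo) / m
        grid = [lo + j * step for j in range(m + 1)]     # m + 1 end points
        values = [objective(h) for h in grid]
        best = min(range(m + 1), key=values.__getitem__)
        if best == 0 and lo > floor:
            lo = max(lo - lower, floor)                  # expand downwards
        elif best == m:
            hi += upper                                  # expand upwards
        else:
            break
    return grid[best], values[best]


# Hypothetical objective with its minimum at h = 1.3, outside the first interval.
demo_h, demo_value = grid_search(lambda h: (h - 1.3) ** 2 + 0.5,
                                 center=0.5, lower=0.2, upper=0.2, m=10)
print(demo_h, demo_value)
```

In the demonstration the initial interval [0.3, 0.7] does not contain the minimizer, so the upper end is extended several times before an interior minimum is found. The resolution of the answer is limited by the mesh width.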


Upon finding the optimal $h_1$, as described above, a constant k is computed as:

$$k = \frac{h_1}{\hat{\sigma}\, n^{-1/5}}$$

which defines the factor that relates the choice of $h_1$ to the unbiased estimator of the standard deviation and the sample size. The following table gives the average optimal $h_1$, the average k, and the average MISE over the 1000 different samples, with their standard deviations in brackets.


Table 4. Optimal h for sample size 20 for Different Distributions

Distribution          # of samples   $h_{1\,opt}$          k                 MISE
Uniform               698            .1629 (.0515)       1.0589 (.4443)    .1116 (.0394)
Exponential           163            .2719 (.0915)       .5334 (.1902)     .0865 (.0304)
Cauchy                163            7.1591 (12.8467)    .9657 (.0800)     .0675 (.0407)
Double Exponential    600            .5922 (.1141)       .8376 (.2723)     .0216 (.0140)
Logistic              347            .7862 (.0899)       1.4821 (.1142)    .0211 (.0164)
Normal                1000           .6160 (.1008)       1.1789 (.3190)    .0145 (.0124)


The method shows an improvement over the choice of h as $s\,n^{-1/5}$. The percentage improvement for each distribution is given in the following table:

Table 5. Percentage improvement in MISE relative to the choice of h as $s\,n^{-1/5}$ for different distributions

Distribution          % improvement
Uniform               11
Exponential           33.2
Cauchy                1
Double Exponential    19
Logistic              52
Normal                26.4

This shows that the choice of $s\,n^{-1/5}$ is a rather good one over the set of distributions studied.

The  following  graphs  show  examples  from  uniform,  normal,  exponential,  and 
logistic  distributions  using  the  constant  k  given  in  table  4. 


[Figure 5. A nonparametric p.d.f. for the Cauchy distribution with sample size 60; horizontal axis x from -8 to 8]

[Figure 8. A nonparametric c.d.f. for the double exponential distribution with sample size 60; horizontal axis x]


Next, an example from each distribution is given for a sample of size 60. The h parameter used is $h = k\,s\,n^{-1/5}$. The seed used for the uniform distribution is the same as for the other distributions. The value of the ISE for the uniform distribution is .0514, for the Cauchy is .0632, for the double exponential is .0076, for the logistic is .0064, and for the normal is .0012.

The uniform distribution fit shows an almost linear behavior for the c.d.f. in the interval [.1, .9]; however, due to the infinite support of the Gaussian kernel, the support of the estimated density is [-.4, 1.4]. This is typical behavior for such an estimator, and remedial measures can be taken to handle such a case; however, the objective is to get a tool that can be used in a Monte Carlo study of relatively large size, where it is not possible to visually examine each case separately. The exponential distribution example shows that the estimated density support is close to the real support. It also indicates that the estimated density is not quite smooth. The behavior can be improved by a larger choice of the h parameter, but this causes the ISE to be larger. The bump near x = 5.5 is due to the existence of at least one observation near the upper tail portion of the support. For the Cauchy distribution a constant multiple of $n^{-1/5}$ is used, as pointed out in the Monte Carlo experiment. Both tails are rough; however, the middle portion of the distribution is reasonably close. The double exponential case gives a fairly close fit except at the lower tail of the distribution. The logistic distribution, for this case, does not give as good a fit as the normal. The normal distribution example shows a good fit at both tails, with the nonparametric distribution skewed to the right. This indicates that the observations from the sample selected are not quite symmetric about the true mean, and hence it shows how the nonparametric distribution follows the sample behavior.




V.  Parameter  Estimation 


Introduction 

Parameter estimation for the three parameter Weibull distribution is discussed in this chapter. While the problem has been handled in different ways, the method used here is based on the numerical solution of the log-likelihood equations using the hybrid method. The hybrid method is an iterative method. The method is surveyed and the stopping rule is stated. The results from this chapter will be compared with the results from the next chapter.

Maximum Likelihood Estimation For The Parameters Of The Three Parameter Weibull Distribution

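The likelihood function for the three parameter Weibull is given by:

$$L(x_1, \ldots, x_n; \delta, \theta, \beta) = \prod_{i=1}^{n} f(x_i) = \left(\beta\,\theta^{-\beta}\right)^n \prod_{i=1}^{n}(x_i - \delta)^{\beta - 1}\exp\left(-\theta^{-\beta}\sum_{i=1}^{n}(x_i - \delta)^{\beta}\right)$$

which gives the following set of equations upon differentiation of the log likelihood function w.r.t. the three unknown parameters:

$$-\frac{n\beta}{\theta} + \beta\,\theta^{-(\beta+1)}\sum_{i=1}^{n}(x_i - \delta)^{\beta} = 0 \tag{80}$$

$$\frac{n}{\beta} - n\ln\theta + \sum_{i=1}^{n}\ln(x_i - \delta) + \theta^{-\beta}\ln\theta\sum_{i=1}^{n}(x_i - \delta)^{\beta} - \theta^{-\beta}\sum_{i=1}^{n}\left[(x_i - \delta)^{\beta}\ln(x_i - \delta)\right] = 0$$

$$-(\beta - 1)\sum_{i=1}^{n}(x_i - \delta)^{-1} + \beta\,\theta^{-\beta}\sum_{i=1}^{n}(x_i - \delta)^{\beta - 1} = 0 \tag{82}$$

The solution of these equations gives a vector $\Theta = (\delta, \theta, \beta)$ that maximizes the log likelihood function (and hence also the likelihood function).

The first equation gives the parameter $\theta$ as a function of $\delta$ and $\beta$ in the form:

$$\theta = \theta(\delta, \beta) \tag{83}$$

while the other two equations are not explicitly solvable for $\beta$, $\delta$. By substituting $\theta$ from the first equation into the other two equations, these last two equations become:

$$\frac{n}{\beta} + \sum_{i=1}^{n}\ln(x_i - \delta) - n\,\frac{\sum_{i=1}^{n}(x_i - \delta)^{\beta}\ln(x_i - \delta)}{\sum_{i=1}^{n}(x_i - \delta)^{\beta}} = 0 \tag{84}$$

$$(1 - \beta)\sum_{i=1}^{n}(x_i - \delta)^{-1} + n\beta\,\frac{\sum_{i=1}^{n}(x_i - \delta)^{\beta - 1}}{\sum_{i=1}^{n}(x_i - \delta)^{\beta}} = 0 \tag{85}$$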


The system of the 3 nonlinear equations for the maximum likelihood in $\Theta = (\delta, \theta, \beta)$ is solved using a numerical technique. The method is known as the hybrid method. This method is basically an iterative method based on the Newton-Raphson method, where the equations have the form:

$$L_i(\Theta) = L_i(\delta, \theta, \beta) = 0, \quad i = 1, 2, 3 \tag{86}$$

where the vector $\Theta$ represents the triplet of the Weibull parameters (location, scale and shape). In this case the Newton-Raphson solution for these equations takes the form:

$$\Theta^{(k+1)} = \Theta^{(k)} - \left[L'(\Theta^{(k)})\right]^{-1} L(\Theta^{(k)}), \quad k = 0, 1, \ldots \tag{87}$$

where $L'(\Theta)$ denotes the Gateaux derivative of L, and L is Gateaux differentiable at $\Theta$ if there exists a linear operator A such that:

$$\lim_{t \to 0}\frac{1}{t}\,\left\| L(\Theta + th) - L(\Theta) - tAh \right\| = 0 \quad \forall\, h \in \mathbb{R}^3 \tag{88}$$


This  method  has  a  quadratic  convergence  properties,  however  it  suffers  from 
the  pitfall  of  failure  to  converge  if  the  initial  guess  0^  is  far  away  from  the  solution 
05- 

Several different modifications were introduced to overcome that problem. Among these is the norm reducing method, where the derivative is multiplied by a factor such that the norm will be non-increasing as the iterations progress. Another method is to ensure that the derivative is non-singular by adding a constant to its diagonal elements, such that the new matrix is non-singular when the derivative is singular. A third method is to recompute the derivative only occasionally. A more detailed discussion of such methods is due to Ortega (1970).

The difficulty of such basic methods is the need to compute 3 components of $L$ and 9 entries of $L'$. Several other modifications were introduced by Powell (1970) to alleviate this problem by avoiding the direct computation of $L'$ through replacing it by difference approximations. Harter and Moore, in their 1965 paper, solved the system of nonlinear equations for joint maximum likelihood estimation from complete and censored samples of the three parameter Weibull (and also of the three parameter Gamma). The proposed iterative procedure was applied to the general case as well as to cases when any one or any two of the three parameters were known. The iterative scheme used here was proposed by Powell (1970), where the derivative is not just scaled by a small factor; instead, a negative multiple of the gradient of $L(\Theta)$ is introduced such that the direction of the correction in the different iterations remains sensible when the Jacobian is almost singular.

The method can be applied in two cases: when the first derivative $L'$ is given or when it is numerically approximated. Since in our case the functional form of the derivative is not complicated, the approach in which the Jacobian is given was chosen.

Methodology 

The technique is basically a modification of the Levenberg/Marquardt idea for the classical Newton-Raphson iterative scheme for the solution of a nonlinear system of equations through the use of:

(1) A negative multiple of the gradient of $L(\Theta)$ to avoid the near singularity of the Jacobian matrix.

(2) A flexible choice of the difference between $\Theta^{(k+1)}$ and $\Theta^{(k)}$ in each step, used to decrease the number of iterations depending on the increase or decrease of $L(\Theta)$.
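The two ingredients listed above can be illustrated with a small damped iteration. This is a minimal sketch of a Levenberg/Marquardt-type step, not Powell's hybrid routine itself: the names `solve_linear` and `damped_newton`, the damping schedule (factor 10), the tolerances, and the toy system are all assumptions chosen for illustration. Each step solves $(J^{T}J + \lambda I)\,d = -J^{T}L$; a large $\lambda$ pushes $d$ towards the negative gradient of $L^{T}L$, a small $\lambda$ recovers the Newton-Raphson step.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def damped_newton(func, jac, start, step_tol=1e-10, max_iter=1000):
    """Reduce the sum of squares of func(x); return (x, Euclidean norm of func(x))."""
    x = list(start)
    n = len(x)
    lam = 1e-3                      # damping parameter (assumed starting value)
    fx = func(x)
    cost = sum(v * v for v in fx)
    for _ in range(max_iter):       # stopping rule: maximum number of iterations
        j = jac(x)
        jtj = [[sum(j[k][r] * j[k][c] for k in range(n)) + (lam if r == c else 0.0)
                for c in range(n)] for r in range(n)]
        grad = [sum(j[k][r] * fx[k] for k in range(n)) for r in range(n)]
        d = solve_linear(jtj, [-g for g in grad])
        if max(abs(di) for di in d) < step_tol:   # stopping rule: step length
            break
        trial = [xi + di for xi, di in zip(x, d)]
        f_trial = func(trial)
        cost_trial = sum(v * v for v in f_trial)
        if cost_trial < cost:       # accept the step and trust Newton more
            x, fx, cost = trial, f_trial, cost_trial
            lam = max(lam / 10.0, 1e-12)
        else:                       # reject the step and lean towards the gradient
            lam *= 10.0
            if lam > 1e12:
                break
    return x, math.sqrt(cost)

# Toy system (hypothetical): x^2 + y^2 = 4 and x = y.
def toy(v):
    return [v[0] ** 2 + v[1] ** 2 - 4.0, v[0] - v[1]]

def toy_jac(v):
    return [[2.0 * v[0], 2.0 * v[1]], [1.0, -1.0]]

root, norm = damped_newton(toy, toy_jac, [1.0, 2.0])
```

Because the iteration minimizes the sum of squares, it can stop at a point where the returned norm is not small; the norm should therefore always be inspected alongside the estimate.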

The running time of the algorithm depends, in general, on the number of equations, the functional behavior of $L(\Theta)$, the initial or starting point $\Theta^{(0)}$, and the accuracy required in terms of the step difference and the norm.

An accuracy of .01 was used for the absolute difference between two successive $\Theta$'s, while the Euclidean norm accuracy was relaxed since the MISE criterion is to be used later for the comparison and the interest was mainly in the convergence of the $\Theta$ parameter.

The algorithm did not converge in a few cases (24 cases), which were excluded from the Monte Carlo results. This happened because the method searches for a zero of the system of nonlinear equations $L(\Theta) = 0$ by minimizing the quadratic form $L^{T}(\Theta)\,L(\Theta)$, that is, the sum of squares of the maximum likelihood equations. In these cases the minimum found did not give a zero of the system.

The initial guess is chosen to be the same for all of the different Monte Carlo samples of size 1000.

It was proved by Powell in 1970 that the iteration either stops due to one of the mentioned stopping rules or converges to a solution $\Theta^{*}$, provided that the Jacobian matrices are bounded and $L(\Theta^{(0)})$ is finite. Powell also proved that the algorithm will stop after a finite number of iterations by one of the two stopping rules provided that $L(\Theta)$ has continuous, bounded first derivatives.

Stopping  Criterion 

In addition, the technique introduces two stopping criteria:

The first is the step length between two successive iterations, which is taken as .01.

The second is the maximum number of iterations, which is taken as 1000.


Results 


The results from the previous application are shown on tables 5 to table 17 at the end of chapter IV, where cases of shape parameter 1, 2, 3 and 4 for sample sizes 10, 20, and 30 with location 10.0 and scale 5.0 are given. The tables show the sample used for each case. The integrated square error (ISE) and the function norm are also given as measures of the closeness and accuracy of the nonlinear solution. The mean integrated square error from the Monte Carlo experiment is shown at the end of the next chapter, where it is compared with the results from the minimum distance estimation technique.


VI.  Minimum  Distance  Estimation 


Introduction 

Minimum distance estimation (MDE) was proposed by Wolfowitz (Wolfowitz, 1950). Parr and Schucany demonstrated the robustness of MDE in predicting the location of symmetric distributions (Parr and Schucany, 1980). Hobbs, Moore, and James (Hobbs and others, 1984) used MDE to find the location of the gamma distribution. Similarly, Hobbs, Moore, and Miller (Hobbs and others, 1985) used MDE to estimate the location of the Weibull. In recent research (Gallagher and Moore, 1989) the previous work was extended by applying MDE to all the distribution parameters and by testing the robustness of MDE.

MDE selects as estimates those p.d.f. parameters which minimize the discrepancy between the sample data and the estimated distribution. The distance measures which are minimized are "goodness of fit statistics" (g.o.f.).

The  MDE  has  the  following  characterization  and  properties: 

1.  Not  susceptible  to  outliers  (Parr  and  Schucany,  1980). 

2.  Statistically  consistent  (Wolfowitz,  1957). 

3.  Easily  applied  to  all  the  parameters  (Parr  and  Schucany,  1980). 

A series of logical candidates for the distance estimation task is studied by Fuchs (Fuchs, 1984). This series includes:

- General exponential power distribution.

- Generalized beta distribution.

- Generalized gamma distribution.

- Generalized t distribution.

- R-S distribution: which was originally developed to generate random variates (Ramberg and Schmeister 1979). It is a generalization of Tukey's lambda function and can be used to model a wide variety of data.

The probability density function of the R-S distribution is given in terms of the percentile function, $R(p)$:

$$f(x \mid p,a,b,c,d) = f(R(p)) = \frac{b}{c\,p^{c-1} + d\,(1-p)^{d-1}} \tag{89}$$

$$R(p) = a + \frac{p^{c} - (1-p)^{d}}{b} \tag{90}$$

where $-\infty < x < \infty$; $-\infty < a, b, c, d < \infty$; $0 \le p \le 1$. The density is the reciprocal of $dR(p)/dp$.

- Generalized life model: developed by Moore and Bilikan, which includes the Weibull and the Rayleigh distributions as special cases. The p.d.f. is given by:

$$f(x; a, b, g(x)) = \frac{b}{a}\, g'(x)\, \left(g(x)\right)^{b-1} \exp\left(-\frac{\left(g(x)\right)^{b}}{a}\right) \tag{91}$$

where $g(x) \in R^{1}$, $\lim_{x\to 0^{+}} g(x) = 0$, $\lim_{x\to\infty} g(x) = \infty$, and $g(x)$ is strictly increasing, $0 < x, a, b < \infty$.


Minimum Distance Estimation For The Three Parameter Weibull Distribution

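The 3-parameter Weibull density function is given by:

$$f(x) = \frac{\beta}{\theta}\left(\frac{x-\delta}{\theta}\right)^{\beta-1}\exp\left[-\left(\frac{x-\delta}{\theta}\right)^{\beta}\right], \quad \delta < x,\ \ \theta,\beta > 0 \tag{92}$$

with expected value

$$E(x) = \delta + \theta\,\Gamma\left(\frac{\beta+1}{\beta}\right)$$

and with variance

$$V(x) = \theta^{2}\left[\Gamma\left(\frac{\beta+2}{\beta}\right) - \Gamma^{2}\left(\frac{\beta+1}{\beta}\right)\right]$$

The nonparametric density estimate of chapter III is used with a Gaussian kernel, which is defined as:

$$\hat f(x) = \frac{1}{nh}\sum_{i=1}^{n} K\left(\frac{x-X_i}{h}\right) = \frac{1}{nh}\sum_{i=1}^{n}\frac{1}{\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{x-X_i}{h}\right)^{2}\right]$$

The C.D.F. of this kernel density, $\hat F(x)$, is given as:

$$\hat F(x) = \int_{-\infty}^{x}\hat f(t)\,dt = \frac{1}{n}\sum_{i=1}^{n}\int_{-\infty}^{x}\frac{1}{\sqrt{2\pi}\,h}\exp\left[-\frac{1}{2}\left(\frac{t-X_i}{h}\right)^{2}\right]dt = \frac{1}{n}\sum_{i=1}^{n}\Phi\left(\frac{x-X_i}{h}\right)$$

where $\Phi(t)$ denotes the C.D.F. for a standard normal random variable.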
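The Gaussian kernel estimate and its C.D.F. translate directly into code. The following is a minimal sketch using only the standard library; the function names `kernel_pdf`, `std_normal_cdf` and `kernel_cdf` are illustrative, and the standard normal C.D.F. is evaluated through the error function.

```python
import math

def kernel_pdf(x, data, h):
    """Gaussian kernel density estimate at x with window width h."""
    n = len(data)
    total = sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)
    return total / (n * h * math.sqrt(2.0 * math.pi))

def std_normal_cdf(t):
    """C.D.F. of a standard normal random variable."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def kernel_cdf(x, data, h):
    """C.D.F. of the kernel density: the average of normal C.D.F.s centred at the data."""
    return sum(std_normal_cdf((x - xi) / h) for xi in data) / len(data)
```

Differentiating `kernel_cdf` numerically should reproduce `kernel_pdf`, which is a convenient sanity check on both functions.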


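The Cramer-von Mises statistic $W^{2}$ is used. This g.o.f. statistic is

$$W^{2} = n\int\left[\hat F(x) - F_0(x)\right]^{2} dF_0(x)$$

or the computational formula:

$$W^{2} = \sum_{j=1}^{n}\left[F_0(x_{(j)}) - \frac{2j-1}{2n}\right]^{2} + \frac{1}{12n}$$

where $\hat F(x)$ is the C.D.F. of the nonparametric estimate, $F_0(x)$ is the parametric (Weibull) C.D.F., and $x_{(j)}$ is the $j$th order statistic.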
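As an illustration of the computational formula, the sketch below evaluates the classical order-statistic form of the Cramer-von Mises statistic against a three parameter Weibull C.D.F. It is an assumption that this classical form is the one intended here; the names `weibull_cdf` and `cvm_statistic` are hypothetical.

```python
import math

def weibull_cdf(x, delta, theta, beta):
    """C.D.F. of the three parameter Weibull (location delta, scale theta, shape beta)."""
    if x <= delta:
        return 0.0
    return 1.0 - math.exp(-((x - delta) / theta) ** beta)

def cvm_statistic(sample, cdf):
    """Sum over order statistics of [F0(x_(j)) - (2j - 1)/(2n)]^2, plus 1/(12n)."""
    xs = sorted(sample)
    n = len(xs)
    total = sum((cdf(x) - (2 * j - 1) / (2.0 * n)) ** 2
                for j, x in enumerate(xs, start=1))
    return total + 1.0 / (12.0 * n)
```

The statistic can never fall below $1/(12n)$, which gives a quick plausibility check for any reported minimum (for example 0.00833 for $n = 10$).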


As noted earlier, the optimal value of the window width $h$ (in the MISE sense) depends on the choice of the kernel $K$, the underlying unknown density $f(x)$, and the sample size, i.e.

$$h_{opt} = f_1(K)\cdot f_2(f(x))\cdot f_3(n) \tag{103}$$

A reasonable approximation for this optimal value for a normal sample is $h = k\,n^{-1/5}$, where $k$ is a real constant (see equation 38). Although this approximation simplifies the optimal expression for the window width and works fine with the normal distribution, it is not as good for other distributions. This leads to the idea of introducing the underlying density in another approximating expression for $h$. The explicit expression for $h_{opt}$ is given as:

$$h_{opt} = m_2^{-2/5}\left\{\int K^{2}(t)\,dt\right\}^{1/5}\left\{\int f''(x)^{2}\,dx\right\}^{-1/5} n^{-1/5} \tag{104}$$

where $m_2$ denotes the kernel second moment. In the case of a Gaussian kernel:

$$m_2 = \int t^{2}K(t)\,dt = V(t) = 1 \tag{105}$$

Also, $\int K^{2}(t)\,dt$ is simply equal to $\frac{1}{2\sqrt{\pi}}$.


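Now, let

$$S_i(x) = \exp\left[-\frac{1}{2}\left(\frac{x-X_i}{h}\right)^{2}\right] \tag{106}$$

Hence, $\hat f(x)$ can be written as

$$\hat f(x) = \frac{1}{\sqrt{2\pi}\,nh}\sum_{i=1}^{n} S_i(x) \tag{108}$$

With

$$S_i'(x) = -\frac{x-X_i}{h^{2}}\exp\left[-\frac{1}{2}\left(\frac{x-X_i}{h}\right)^{2}\right] = S_i(x)\,l_i(x) \tag{110}$$

$$l_i(x) = -\frac{x-X_i}{h^{2}} \tag{111}$$

the derivatives of the estimate are

$$\hat f'(x) = \frac{1}{\sqrt{2\pi}\,nh}\sum_{i=1}^{n} S_i(x)\,l_i(x) \tag{112}$$

$$\hat f''(x) = \frac{1}{\sqrt{2\pi}\,nh}\sum_{i=1}^{n}\left[l_i^{2}(x)\,S_i(x) - \frac{S_i(x)}{h^{2}}\right] \tag{113}$$

$$\hat f''^{2}(x) = \frac{1}{2\pi n^{2}h^{2}}\left\{\sum_{i=1}^{n} S_i(x)\left[l_i^{2}(x) - \frac{1}{h^{2}}\right]\right\}^{2} \tag{114}$$

since

$$\left[\sum_{k=1}^{n} c_k\right]^{2} = \sum_{k,j=1}^{n} c_k c_j \tag{115}$$

thus

$$\hat f''^{2}(x) = \frac{1}{2\pi n^{2}h^{2}}\sum_{i,j=1}^{n} S_i(x)\,S_j(x)\left[l_i^{2}(x) - \frac{1}{h^{2}}\right]\left[l_j^{2}(x) - \frac{1}{h^{2}}\right] \tag{116}$$

The product of the two exponentials is

$$\begin{aligned}
S_i(x)\,S_j(x) &= \exp\left[-\frac{1}{2}\left(\frac{x-X_i}{h}\right)^{2} - \frac{1}{2}\left(\frac{x-X_j}{h}\right)^{2}\right] \\
&= \exp\left[-\frac{1}{2h^{2}}\left(2x^{2} - 2x(X_i+X_j) + X_i^{2} + X_j^{2}\right)\right] \\
&= \exp\left[-\frac{(X_i-X_j)^{2}}{4h^{2}}\right]\exp\left[-\frac{1}{h^{2}}\left(x - \frac{X_i+X_j}{2}\right)^{2}\right] \\
&= \sqrt{\pi}\,h\,\exp\left[-\frac{(X_i-X_j)^{2}}{4h^{2}}\right] g_{ij}(x)
\end{aligned} \tag{117}$$

where $g_{ij}(x)$ is a normal density with mean $m_{ij} = (X_i+X_j)/2$ and variance $h^{2}/2$.

Now, let $l_i(x)$, $l_j(x)$ be written as $l_i$, $l_j$ for simplicity of notation.

$$\begin{aligned}
\left(l_i^{2} - \frac{1}{h^{2}}\right)\left(l_j^{2} - \frac{1}{h^{2}}\right) &= \frac{x^{4} - 2x^{3}(X_i+X_j) + x^{2}(X_i^{2}+X_j^{2}+4X_iX_j) - 2xX_iX_j(X_i+X_j) + X_i^{2}X_j^{2}}{h^{8}} \\
&\quad - \frac{2x^{2} - 2x(X_i+X_j) + X_i^{2} + X_j^{2}}{h^{6}} + \frac{1}{h^{4}} \\
&= \frac{x^{4}}{h^{8}} - \frac{2x^{3}(X_i+X_j)}{h^{8}} + x^{2}\left[\frac{X_i^{2}+X_j^{2}+4X_iX_j}{h^{8}} - \frac{2}{h^{6}}\right] \\
&\quad - 2x\left[\frac{X_iX_j(X_i+X_j)}{h^{8}} - \frac{X_i+X_j}{h^{6}}\right] + \frac{X_i^{2}X_j^{2}}{h^{8}} - \frac{X_i^{2}+X_j^{2}}{h^{6}} + \frac{1}{h^{4}}
\end{aligned}$$

$$\begin{aligned}
\hat f''^{2}(x) = \frac{1}{2\sqrt{\pi}\,n^{2}}\sum_{i,j=1}^{n} e^{-(X_i-X_j)^{2}/4h^{2}}\, g_{ij}(x)\Big\{ &\frac{x^{4}}{h^{9}} - \frac{2x^{3}(X_i+X_j)}{h^{9}} + x^{2}\left[\frac{X_i^{2}+X_j^{2}+4X_iX_j}{h^{9}} - \frac{2}{h^{7}}\right] \\
&- 2x\left[\frac{X_iX_j(X_i+X_j)}{h^{9}} - \frac{X_i+X_j}{h^{7}}\right] + \frac{X_i^{2}X_j^{2}}{h^{9}} - \frac{X_i^{2}+X_j^{2}}{h^{7}} + \frac{1}{h^{5}}\Big\}
\end{aligned}$$

Hence

$$\begin{aligned}
\int \hat f''^{2}(x)\,dx = \frac{1}{2\sqrt{\pi}\,n^{2}}\sum_{i,j=1}^{n} e^{-(X_i-X_j)^{2}/4h^{2}}\Big\{ &\frac{E(x^{4})}{h^{9}} - \frac{2E(x^{3})(X_i+X_j)}{h^{9}} + E(x^{2})\left[\frac{X_i^{2}+X_j^{2}+4X_iX_j}{h^{9}} - \frac{2}{h^{7}}\right] \\
&- 2E(x)\left[\frac{X_iX_j(X_i+X_j)}{h^{9}} - \frac{X_i+X_j}{h^{7}}\right] + \frac{X_i^{2}X_j^{2}}{h^{9}} - \frac{X_i^{2}+X_j^{2}}{h^{7}} + \frac{1}{h^{5}}\Big\}
\end{aligned}$$

where the expectations are taken with respect to $g_{ij}(x)$:

$$E(x^{4}) = \left(\frac{X_i+X_j}{2}\right)^{4} + 3\left(\frac{X_i+X_j}{2}\right)^{2}h^{2} + \frac{3}{4}h^{4} \tag{118}$$

$$E(x^{3}) = \left(\frac{X_i+X_j}{2}\right)^{3} + \frac{3}{2}\left(\frac{X_i+X_j}{2}\right)h^{2} \tag{119}$$

$$E(x^{2}) = \left(\frac{X_i+X_j}{2}\right)^{2} + \frac{h^{2}}{2} \tag{120}$$

$$E(x) = \frac{X_i+X_j}{2} \tag{121}$$

Substituting these moments, with $m_{ij} = (X_i+X_j)/2$:

$$\begin{aligned}
\int \hat f''^{2}(x)\,dx = \frac{1}{2\sqrt{\pi}\,n^{2}}\sum_{i,j=1}^{n} e^{-(X_i-X_j)^{2}/4h^{2}}\Big\{ &\frac{m_{ij}^{4} + 3m_{ij}^{2}h^{2} + \frac{3}{4}h^{4}}{h^{9}} - \frac{2(X_i+X_j)\left[m_{ij}^{3} + \frac{3}{2}m_{ij}h^{2}\right]}{h^{9}} \\
&+ \left[m_{ij}^{2} + \frac{h^{2}}{2}\right]\left[\frac{X_i^{2}+X_j^{2}+4X_iX_j}{h^{9}} - \frac{2}{h^{7}}\right] \\
&- (X_i+X_j)\left[\frac{X_iX_j(X_i+X_j)}{h^{9}} - \frac{X_i+X_j}{h^{7}}\right] + \frac{X_i^{2}X_j^{2}}{h^{9}} - \frac{X_i^{2}+X_j^{2}}{h^{7}} + \frac{1}{h^{5}}\Big\}
\end{aligned} \tag{122}$$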
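The closed form for the integral of the squared second derivative can be cross-checked against direct numerical integration. The sketch below is an illustration only: it assumes the Gaussian kernel, that the pairwise normal density has mean $(X_i+X_j)/2$ and variance $h^{2}/2$ (which follows from completing the square in the product of the two exponentials), and the function names are hypothetical.

```python
import math

def second_derivative(x, data, h):
    """Second derivative of the Gaussian kernel density estimate at x."""
    n = len(data)
    total = 0.0
    for xi in data:
        u = (x - xi) / h
        total += math.exp(-0.5 * u * u) * (u * u - 1.0) / (h * h)
    return total / (math.sqrt(2.0 * math.pi) * n * h)

def roughness_closed_form(data, h):
    """Integral of the squared second derivative via pairwise normal moments."""
    n = len(data)
    total = 0.0
    for xi in data:
        for xj in data:
            s, p, q = xi + xj, xi * xj, xi * xi + xj * xj
            m = 0.5 * s
            e1 = m
            e2 = m ** 2 + 0.5 * h ** 2
            e3 = m ** 3 + 1.5 * m * h ** 2
            e4 = m ** 4 + 3.0 * m ** 2 * h ** 2 + 0.75 * h ** 4
            term = (e4 / h ** 9 - 2.0 * s * e3 / h ** 9
                    + e2 * ((q + 4.0 * p) / h ** 9 - 2.0 / h ** 7)
                    - 2.0 * e1 * (p * s / h ** 9 - s / h ** 7)
                    + p * p / h ** 9 - q / h ** 7 + 1.0 / h ** 5)
            total += math.exp(-(xi - xj) ** 2 / (4.0 * h * h)) * term
    return total / (2.0 * math.sqrt(math.pi) * n * n)

def roughness_numeric(data, h, points_per_h=50):
    """Same integral by a simple Riemann sum on a fine grid (for checking only)."""
    lo, hi = min(data) - 10.0 * h, max(data) + 10.0 * h
    step = h / points_per_h
    k = int(round((hi - lo) / step))
    return step * sum(second_derivative(lo + i * step, data, h) ** 2 for i in range(k + 1))
```

The closed form subtracts large terms of order $m^{4}/h^{9}$ from each other, so for data far from the origin it is numerically safer to centre the sample first; the integral is unchanged by such a shift.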


On substituting the previous integral for the integral of the squared second derivative of the density in the expression for the optimal $h$, the window width satisfies:

$$h = T(h) \tag{123}$$

or equivalently:

$$T_1(h) = h - T(h) = 0 \tag{124}$$

which can be solved by one of the general methods for the solution of one equation in one unknown, such as Newton's method, the secant method, Steffensen's method, or any of their variations.

Newton's method has the form:

$$h_{k+1} = h_k - \left[T_1'(h_k)\right]^{-1} T_1(h_k) \tag{125}$$

which gives quadratic convergence, i.e.

$$\|h_{k+1} - h^{*}\| \le c\,\|h_k - h^{*}\|^{2} \tag{126}$$

for $h_k$ sufficiently close to $h^{*}$.

An alternative for computing the window width, which is more efficient computationally and gives a good improvement in this application, is to choose an empirical $h$ equal to $s\,n^{-1/5}$, where $s$ represents the sample standard deviation. This suggested $h$ gave an MISE close enough to the optimal theoretical one, and it was adopted since it is simple, needs no extensive computations, and avoids the degeneracy sometimes faced by the iterative approach.
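The empirical window width is a one-line computation. The sketch below applies it to the ten data values listed in Table 7, assuming that $s$ is the usual sample standard deviation with divisor $n-1$ (an assumption; the divisor is not stated in the text). The function name `empirical_window_width` is illustrative.

```python
import statistics

def empirical_window_width(sample):
    """Window width h = s * n^(-1/5), with s the (n - 1)-divisor standard deviation."""
    n = len(sample)
    return statistics.stdev(sample) * n ** (-0.2)

# Data values of Table 7 (shape 1.0, sample size 10).
table7_sample = [10.009320, 10.226890, 10.798260, 10.866060, 11.054560,
                 11.788680, 14.245620, 14.910420, 17.955210, 24.277580]
h = empirical_window_width(table7_sample)
```

Under this assumption the result is approximately 2.86, in line with the window width of 2.8563 reported in Table 7.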

Methodology 

The Monte Carlo procedure for this application can be described in the following three steps:

Step I

- Different samples from the Weibull with a given location, scale, and shape are generated for different sample sizes. The uniform random numbers are generated using the RNUN routine from the IMSL.

- The Weibull deviates are generated using the inverse C.D.F. technique.

Step II

- The MLE estimators for the three parameters are computed as discussed earlier.

- The CvM statistic is computed for the estimated density with the MLE for the parameters.

Step III

- The CvM statistic is minimized with $\Theta$ as the decision vector and with the given constraints on the values of the parameters.

- The nonlinear program is solved using a quasi-Newton method.

- The new parameter estimates are compared with those of the MLE, using the ISE as a measure for the comparison.
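The inverse C.D.F. technique of Step I can be sketched as follows. Solving $u = 1 - \exp[-((x-\delta)/\theta)^{\beta}]$ for $x$ gives $x = \delta + \theta\,(-\ln(1-u))^{1/\beta}$. The sketch uses Python's `random` module in place of the IMSL RNUN routine, which is an assumption made purely for illustration; the function names are hypothetical.

```python
import math
import random

def weibull_inverse_cdf(u, delta, theta, beta):
    """Map a uniform(0, 1) number u to a three parameter Weibull deviate."""
    return delta + theta * (-math.log(1.0 - u)) ** (1.0 / beta)

def weibull_sample(n, delta, theta, beta, rng):
    """Draw n Weibull deviates by the inverse C.D.F. technique and return them sorted."""
    return sorted(weibull_inverse_cdf(rng.random(), delta, theta, beta)
                  for _ in range(n))

rng = random.Random(12345)            # fixed seed so the sketch is repeatable
sample = weibull_sample(20, 10.0, 5.0, 2.0, rng)
```

Every generated value is at least the location parameter, so the sample minimum gives an upper bound for any admissible location estimate.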

Results 

Together with the results from the previous chapter, the end result for this application is shown in Table 6. The table shows that the MLE method and the new technique are statistically the same for shape parameter 1. However, the new technique shows a significant improvement over the MLE method for shape parameters 2, 3, and 4. For shape parameter 2 the new method gives an MISE which is 5.3 times smaller than that of the MLE, while in the case of shape parameter 3 the MISE from the new technique is about 6 times smaller than that of the MLE. For shape parameter 4 a tremendous improvement is obtained, where the ratio of the MISE for the MLE to that of the new technique is 15.9. Table 7 to table 18 give examples from each of the four shape parameter values chosen for the Monte Carlo. These tables give a case for each value of the shape parameter 1, 2, 3, and 4 for sample sizes 10, 20, and 30 with location 10.0 and scale 5.0. The same sample is used to iteratively solve the maximum likelihood nonlinear equations. The integrated square error (ISE), the value of the window width used, and the optimal value for the CvM statistic based on using the nonparametric density estimation approach are given. The graphs for these cases are given in figures 1 to 24, while the next table shows the resulting MISE together with its standard deviation for sample size 20 for the different parameter values, for both the new proposed estimation technique and the modified nonlinear method for solving the ML equations.

Table 6. Results from M.C. size 1000 for sample size 20 (standard deviations in parentheses)

Weibull(loc., sca., sha.)    MISE_CvM            MISE_MLE
W(10,5,1)                    .13209 (.1723)      .13678 (.17820)
W(10,5,2)                    .04970 (.05061)     .26364 (.19757)
W(10,5,3)                    .03378 (.03385)     .20255 (.17740)
W(10,5,4)                    .02575 (.02551)     .40923 (.49626)

Table 7 to table 18 show that the choice of the h parameter varies from sample to sample and from one shape parameter to another. The tables also show variations in the value of the ISE over different shape parameters for the Weibull density. These variations in the h value, together with the variations in the ISE, indicate that the method used is an adaptive one, in the sense that the choice of the parameter h, which is data dependent, varies with the distribution shape and the particular sample.

Thus the final conclusion is that the minimum distance estimation method, using the CvM statistic as a measure of the difference between a nonparametric estimator based on a suggested window width and a parametric density with unknown parameters, gives in general a much smaller MISE value than the maximum likelihood method.


Table 7. Weibull Sample (Shape = 1.0 and Sample Size = 10)

TRUE PARAMETERS ARE
Location = 10.0
Scale = 5.0
Shape = 1.0
SAMPLE SIZE = 10

Weibull Data Values

10.009320
10.226890
10.798260
10.866060
11.054560
11.788680
14.245620
14.910420
17.955210
24.277580

                   TRUE        MLE      MDCVM
LOCATION        10.0000    10.0080     7.4580
SCALE            5.0000     3.4000     6.8850
SHAPE            1.0000     0.9500     1.2990
ISE                         0.1398     0.1253

Function Norm    3011.9329
Window Width        2.8563
Optimal CvM         0.0090

[Figure: Parameter estimation for the three parameter Weibull density W(10,5,1), sample size 10, using nonparametric modified MDE technique]

[Figure: Parameter estimation for the three parameter Weibull C.D.F W(10,5,1), sample size 10, using nonparametric modified MDE technique]


Table 8. Weibull Sample (Shape = 2.0 and Sample Size = 10)

TRUE PARAMETERS ARE
Location = 10.0
Scale = 5.0
Shape = 2.0
SAMPLE SIZE = 10

Weibull Data Values

10.905100
12.259120
13.099780
14.168940
14.219130
15.972290
16.971910
16.296600
16.373850
16.426060

                   TRUE        MLE      MDCVM
LOCATION        10.0000    10.9041     0.4785
SCALE            5.0000     2.9963    14.9260
SHAPE            2.0000     0.9469     7.1468
ISE                         0.1231     0.0221

Function Norm    3234.4231
Window Width        1.2068
Optimal CvM         0.0085

Figure 15. p.d.f for W(10,5,2) with N=10 (plot title: Parameter estimation for the three parameter Weibull density W(10,5,2), sample size 10, using nonparametric modified MDE technique)

[Figure: Parameter estimation for the three parameter Weibull C.D.F W(10,5,2), sample size 30, using nonparametric modified MDE technique]


Table 9. Weibull Sample (Shape = 3.0 and Sample Size = 10)

TRUE PARAMETERS ARE
Location = 10.0
Scale = 5.0
Shape = 3.0
SAMPLE SIZE = 10

Weibull Data Values

11.783980
12.080590
12.377060
13.530170
13.606270
13.616520
13.776460
13.902000
15.878190
16.802490

                   TRUE        MLE      MDCVM
LOCATION        10.0000     1.7830    10.6560
SCALE            5.0000     4.8683     3.5345
SHAPE            3.0000     1.1199     1.7830
ISE                         0.3463     0.1040

Function Norm   13781.3096
Window Width        0.9975
Optimal CvM         0.0086

[Figure: Parameter estimation for the three parameter Weibull density W(10,5,3), sample size 10, using nonparametric modified MDE technique]

Figure 18. C.D.F. for W(10,5,3) with N=10 (plot title: Parameter estimation for the three parameter Weibull C.D.F W(10,5,3), sample size 10, using nonparametric modified MDE technique)


Table 10. Weibull Sample (Shape = 4.0 and Sample Size = 10)

TRUE PARAMETERS ARE
Location = 10.0
Scale = 5.0
Shape = 4.0
SAMPLE SIZE = 10

Weibull Data Values

11.963030
12.860350
13.219490
14.176460
14.512920
14.892670
14.919950
15.371890
15.641700
16.557470

                   TRUE        MLE      MDCVM
LOCATION        10.0000    11.9620     6.1854
SCALE            5.0000     1.4014     8.8608
SHAPE            4.0000     1.6940     5.8242
ISE                         0.5027     0.0134

Function Norm  478115.3125
Window Width        0.8778
Optimal CvM         0.0085

Figure 20. C.D.F. for W(10,5,4) with N=10 (plot title: Parameter estimation for the three parameter Weibull C.D.F W(10,5,4), sample size 10, using nonparametric modified MDE technique)


Table 11. Weibull Sample (Shape = 1.0 and Sample Size = 20)

TRUE PARAMETERS ARE
Location = 10.0
Scale = 5.0
Shape = 1.0
SAMPLE SIZE = 20

Weibull Data Values

10.058190    12.376410
10.227110    12.453510
10.360260    12.490680
10.423800    13.050770
10.537260    14.149840
11.759740    14.439350
11.876000    17.355110
11.892060    18.124390
11.906630    18.503469
12.154340    22.591120

                   TRUE        MLE      MDCVM
LOCATION        10.0000    10.0572     8.7660
SCALE            5.0000     3.0587     5.0997
SHAPE            1.0000     0.9860     1.2873
ISE                         0.2165     0.1183

Function Norm     508.4898
Window Width        1.8328
Optimal CvM         0.0058

[Figure: Parameter estimation for the three parameter Weibull C.D.F W(10,5,1), sample size 20, using nonparametric modified MDE technique]


Table 12. Weibull Sample (Shape = 2.0 and Sample Size = 20)

TRUE PARAMETERS ARE
Location = 10.0
Scale = 5.0
Shape = 2.0
SAMPLE SIZE = 20

Weibull Data Values

10.539380    13.447030
11.065610    13.502510
11.342130    13.528930
11.455670    13.905620
11.638990    14.555130
12.966260    14.711340
13.062680    16.061280
13.075760    16.373541
13.087580    16.520531
13.282030    17.934460

                   TRUE        MLE      MDCVM
LOCATION        10.0000    10.5384     9.0263
SCALE            5.0000     2.1504     5.2000
SHAPE            2.0000     1.0423     2.2099
ISE                         0.4962     0.0737

Function Norm    1238.4871
Window Width        1.0842
Optimal CvM         0.0050

[Figure: Parameter estimation for the three parameter Weibull density W(10,5,2), sample size 20, using nonparametric modified MDE technique]

Figure 24. C.D.F. for W(10,5,2) with N=20 (plot title: Parameter estimation for the three parameter Weibull C.D.F W(10,5,2), sample size 20, using nonparametric modified MDE technique)


Table 13. Weibull Sample (Shape = 3.0 and Sample Size = 20)

TRUE PARAMETERS ARE
Location = 10.0
Scale = 5.0
Shape = 3.0
SAMPLE SIZE = 20

Weibull Data Values

11.133070    13.902000
11.783980    13.943750
12.080590    13.963560
12.196340    14.240820
12.377060    14.698840
13.530170    14.805660
13.606270    15.686470
13.616520    15.878190
13.625780    15.968230
13.776460    16.802490

                   TRUE        MLE      MDCVM
LOCATION        10.0000    11.1321     8.9949
SCALE            5.0000     3.0818     5.4584
SHAPE            3.0000     1.1842     3.2065
ISE                         0.1308     0.0552

Function Norm   31529.4902
Window Width        0.8209
Optimal CvM         0.0047

Figure 25. p.d.f for W(10,5,3) with N=20 (plot title: Parameter estimation for the three parameter Weibull density W(10,5,3), sample size 20, using nonparametric modified MDE technique)

Figure 26. C.D.F. for W(10,5,3) with N=20 (plot title: Parameter estimation for the three parameter Weibull C.D.F W(10,5,3), sample size 20, using nonparametric modified MDE technique)


Table 14. Weibull Sample (Shape = 4.0 and Sample Size = 20)

TRUE PARAMETERS ARE
Location = 10.0
Scale = 5.0
Shape = 4.0
SAMPLE SIZE = 20

Weibull Data Values

11.963030    14.216850
12.212220    14.512920
12.269390    14.717380
12.331440    14.892670
12.860350    14.919950
13.062320    15.143880
13.219490    15.371890
13.540850    15.641700
14.020930    16.008890
14.176460    16.557470

                   TRUE        MLE      MDCVM
LOCATION        10.0000    11.9620     8.0986
SCALE            5.0000     3.4861     6.5663
SHAPE            4.0000     1.0087     4.1925
ISE                         0.1814     0.0497

Function Norm       3.8416
Window Width        0.7450
Optimal CvM         0.0051

[Figure: Parameter estimation for the three parameter Weibull C.D.F W(10,5,4), sample size 20, using nonparametric modified MDE technique]


Table 15. Weibull Sample (Shape = 1.0 and Sample Size = 30)

TRUE PARAMETERS ARE
Location = 10.0
Scale = 5.0
Shape = 1.0
SAMPLE SIZE = 30

Weibull Data Values

10.058190    11.892060    14.149840
10.227110    11.906630    14.439350
10.360260    11.930340    15.096300
10.423800    12.154340    17.355110
10.537260    12.376410    18.124390
10.761180    12.453510    18.503469
11.346590    12.490680    19.405500
11.719840    12.508960    19.906870
11.759740    13.050770    22.591120
11.876000    13.821920    35.215561

                   TRUE        MLE      MDCVM
LOCATION        10.0000    10.0572     7.8721
SCALE            5.0000     3.0323     6.8966
SHAPE            1.0000     1.0012     1.3983
ISE                         0.2278     0.0528

Function Norm     540.3060
Window Width        2.5974
Optimal CvM         0.0040

Figure 29. p.d.f for W(10,5,1) with N=30 (plot title: Parameter estimation for the three parameter Weibull density W(10,5,1), sample size 30, using nonparametric modified MDE technique)

Figure 30. C.D.F. for W(10,5,1) with N=30 (plot title: Parameter estimation for the three parameter Weibull C.D.F W(10,5,1), sample size 30, using nonparametric modified MDE technique)


Table 16. Weibull Sample (Shape = 2.0 and Sample Size = 30)

TRUE PARAMETERS ARE
Location = 10.0
Scale = 5.0
Shape = 2.0
SAMPLE SIZE = 30

Weibull Data Values

10.539380    13.075760    14.555130
11.065610    13.087580    14.711340
11.342130    13.106710    15.047920
11.455670    13.282030    16.064280
11.638990    13.447030    16.373541
11.950870    13.502510    16.520531
12.594800    13.528930    16.857660
12.932440    13.541860    17.038059
12.966260    13.905620    17.934460
13.062680    14.371460    21.228439

                   TRUE        MLE      MDCVM
LOCATION        10.0000    10.5384     9.5878
SCALE            5.0000     2.1066     4.9859
SHAPE            2.0000     1.0111     1.9222
ISE                         0.5117     0.0248

Function Norm      18.6620
Window Width        1.1761
Optimal CvM         0.0045

Figure 32. C.D.F. for W(10,5,2) with N=30 (plot title: Parameter estimation for the three parameter Weibull C.D.F W(10,5,2), sample size 30, using nonparametric modified MDE technique)


Table 17. Weibull Sample (Shape = 3.0 and Sample Size = 30)

TRUE PARAMETERS ARE
Location = 10.0
Scale = 5.0
Shape = 3.0
SAMPLE SIZE = 30

Weibull Data Values

11.133070    13.616520    14.698840
11.783980    13.625780    14.805660
12.080590    13.640750    15.031900
12.196340    13.776460    15.686470
12.377060    13.902000    15.878190
12.669780    13.943750    15.968230
13.228930    13.963560    16.172211
13.503290    13.973240    16.279989
13.530170    14.240820    16.802490
13.606270    14.571660    18.574381

                   TRUE        MLE      MDCVM
LOCATION        10.0000    11.1321    11.1331
SCALE            5.0000     2.6142     3.5205
SHAPE            3.0000     1.1138     1.8773
ISE                         0.2291     0.0182

Function Norm   10696.4004
Window Width        0.8241
Optimal CvM         0.0091

Figure 33. p.d.f for W(10,5,3) with N=30 (plot title: Parameter estimation for the three parameter Weibull density W(10,5,3), sample size 30, using nonparametric modified MDE technique)


Table 18. Weibull Sample (Shape = 4.0 and Sample Size = 30)

TRUE PARAMETERS ARE
Location = 10.0
Scale = 5.0
Shape = 4.0
SAMPLE SIZE = 30

Weibull Data Values

11.963030    13.634270    15.143880
12.212220    14.020930    15.145930
12.269390    14.054020    15.312240
12.331440    14.176460    15.371890
12.840970    14.216850    15.398440
12.860350    14.512920    15.618870
13.062320    14.717380    15.641700
13.130750    14.755560    16.008890
13.219490    14.892670    16.538071
13.540850    14.919950    16.557470

                   TRUE        MLE      MDCVM
LOCATION        10.0000    11.9620     7.7954
SCALE            5.0000     3.3463     7.0515
SHAPE            4.0000     1.0086     4.7788
ISE                         0.1633     0.0195

Function Norm      90.9906
Window Width        0.6636
Optimal CvM         0.0048

[Figure: Parameter estimation for the three parameter Weibull density W(10,5,4), sample size 30, using nonparametric modified MDE technique]

Figure 36. C.D.F for W(10,5,4) with N=30 (plot title: Parameter estimation for the three parameter Weibull C.D.F W(10,5,4), sample size 30, using nonparametric modified MDE technique)


VII.  GOODNESS  OF  FIT  APPLICATION 


Introduction 

When a sample is drawn from a certain distribution it is hoped that its empirical distribution function (E.D.F.) will resemble the population cumulative distribution function (C.D.F.). The resemblance here should take a quantitative meaning. This is done by measuring the closeness or the distance of the E.D.F. to the C.D.F. Thus, if $F_n(x)$ represents the E.D.F. and $F_0(x)$ represents the true theoretical C.D.F., then the many different ways of considering the distance between $F_n(x)$ and $F_0(x)$ suggest a wide class of goodness of fit statistics. Gini's index of dissimilarity, the integrated absolute difference between both C.D.F.'s, is among these fitting criteria between $F_n(x)$ and $F_0(x)$. This index is also modified by weighting the integral by $F_0'(x)$. Cramer and von Mises introduced the well-known Cramer-von Mises statistic (CvM) by weighting the integral of the squared difference between the C.D.F.'s by $F_0'(x)$. Anderson and Darling introduced the Anderson-Darling statistic (AD) by weighting the Cramer-von Mises integral by $[F_0(x)(1-F_0(x))]^{-1}$. The Watson, Kolmogorov-Smirnov, and Kuiper statistics are other examples of goodness of fit statistics. Such statistics are surveyed by Stephens (1974). The computational formulae of some of these statistics are given below:


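K-S Statistic K

$$K = \max\left(D^{+}, D^{-}\right) \tag{127}$$

where,

$$D^{+} = \sup_{1\le i\le n}\left(\frac{i}{n} - F_i\right) \tag{128}$$

$$D^{-} = \sup_{1\le i\le n}\left(F_i - \frac{i-1}{n}\right) \tag{129}$$

and $F_i$ is $F_0$ at the $i$th order statistic.

Anderson-Darling $A^{2}$

$$A^{2} = -n - \frac{1}{n}\sum_{i=1}^{n}(2i-1)\left[\ln F_i + \ln\left(1 - F_{n+1-i}\right)\right] \tag{130}$$

Cramer-von Mises statistic $W^{2}$

$$W^{2} = \sum_{i=1}^{n}\left[F_i - \frac{2i-1}{2n}\right]^{2} + \frac{1}{12n} \tag{131}$$

Kuiper statistic V

$$V = D^{+} + D^{-} \tag{132}$$

The Watson statistic $U^{2}$

$$U^{2} = W^{2} - n\left(\bar F - .5\right)^{2} \tag{133}$$

where

$$\bar F = \sum_{i=1}^{n}\frac{F_i}{n} \tag{134}$$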
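The computational formulae above can be collected in one routine. The following is a minimal sketch, assuming the input is the list of values $F_i$ (the hypothesized C.D.F. evaluated at the ordered sample, each strictly between 0 and 1); the function name `edf_statistics` and the dictionary keys are illustrative.

```python
import math

def edf_statistics(f):
    """Return the K-S, Kuiper, Cramer-von Mises, Anderson-Darling and Watson statistics.

    f must hold F0 evaluated at the ordered sample, in increasing order.
    """
    n = len(f)
    d_plus = max(i / n - fi for i, fi in enumerate(f, start=1))
    d_minus = max(fi - (i - 1) / n for i, fi in enumerate(f, start=1))
    w2 = sum((fi - (2 * i - 1) / (2.0 * n)) ** 2
             for i, fi in enumerate(f, start=1)) + 1.0 / (12.0 * n)
    # f[n - i] is F_{n+1-i} in the 1-based notation of the text.
    a2 = -n - sum((2 * i - 1) * (math.log(f[i - 1]) + math.log(1.0 - f[n - i]))
                  for i in range(1, n + 1)) / n
    f_bar = sum(f) / n
    return {
        "K": max(d_plus, d_minus),
        "V": d_plus + d_minus,
        "W2": w2,
        "A2": a2,
        "U2": w2 - n * (f_bar - 0.5) ** 2,
    }
```

For a sample that sits exactly at the points $(2i-1)/(2n)$, the routine returns $W^{2} = 1/(12n)$ and $K = 1/(2n)$, the smallest values these statistics can take.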


Modified  Goodness  Of  Fit  Test 

A  goodness  of  fit  test  based  on  the  E.D.F.,  where  the  parameters  are  estimated 
is  called  a  modified  goodness  of  fit  test. 

Basic  Characterization 

1) If the tables for a completely specified null hypothesis are used while the
parameters are estimated, the actual α error becomes much smaller than the nominal
level, which biases the test towards accepting H0.

2) When the parameters are estimated, the null distribution of the test statistic,
and hence the percentage points, will not depend on the location or scale parameter.
However, one must use the same estimators as were used in the construction of the
tables.


Now, let F be a family of C.D.F's with location and scale parameters c and θ
respectively, and let F0 denote the C.D.F obtained by inserting the estimators of
c and θ under H0.

Although the distribution of the test statistic and its percentage points do not
depend on c and θ, one should use the same estimators as those used to
construct the tables.

A modified K-S goodness of fit test, obtained by Monte Carlo simulation, was
introduced for the normal distribution with unknown μ and σ² (Lilliefors, 1966)
and for the exponential distribution with unknown mean (Lilliefors, 1967). The
accompanying power study showed that the modified K-S test had higher power than
the χ²-test for the normal case.
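The Monte Carlo construction of such modified tables can be sketched as follows. This is an illustrative sketch only, not the procedure of the cited papers: the number of replications, the seed, the use of the sample standard deviation with divisor n - 1, and the names `modified_ks_critical_values` and `norm_cdf` are all assumptions made here.

```python
import math
import numpy as np

# Standard normal C.D.F. built from the stdlib error function.
norm_cdf = np.vectorize(lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))

def ks_statistic(f):
    """K = max(D+, D-) for sorted values f = F0(x_(i))."""
    n = f.size
    i = np.arange(1, n + 1)
    return max(np.max(i / n - f), np.max(f - (i - 1) / n))

def modified_ks_critical_values(n, alphas=(0.20, 0.15, 0.10, 0.05, 0.01),
                                reps=2000, seed=0):
    """Simulate the null distribution of K when the normal parameters
    are estimated from each sample, then read off upper percentage points."""
    rng = np.random.default_rng(seed)
    stats = np.empty(reps)
    for r in range(reps):
        x = rng.standard_normal(n)
        # Standardize with the estimated mean and standard deviation.
        z = np.sort((x - x.mean()) / x.std(ddof=1))
        stats[r] = ks_statistic(norm_cdf(z))
    return {a: float(np.quantile(stats, 1.0 - a)) for a in alphas}
```

Because the statistic is computed from standardized data, its null distribution does not depend on the true location or scale, which is why one table per sample size suffices. The resulting percentage points are noticeably smaller than those for a completely specified normal null.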

Woodruff et al. (1983) and Bush et al. (1983) derived tables for the modified K-S,
CvM and AD tests for the Weibull distribution with shape parameter 1 (two parameter
negative exponential). Their study showed that the CvM test had the highest
power for most of the alternative distributions studied when the null hypothesis was
the two parameter negative exponential. They also studied the Weibull with
different shape parameters and showed that the AD statistic was the most powerful
when the null distribution was Weibull with shape parameter 3.5. A relationship
between the critical values and the inverse of the shape parameter was presented for
the range of shape parameters studied.

As for the two parameter Weibull, a BLUE and a BLIE (best linear invariant
estimator) for the unknown parameters are found in Mann (1968), using the fact that
the two parameter Weibull is transformed into the extreme value distribution by a
logarithmic transformation. She also derived a goodness of fit test for the extreme
value distribution of smallest values.

Tables of critical values for the modified K-S, CvM and AD statistics for the
extreme value distribution, obtained by M.C techniques with the MLE used for the
parameters, are derived in a paper by Littelle et al. in 1979.

Tables of the percentage points for the modified K-S, AD and CvM statistics
for the gamma distribution are derived in Woodruff et al. (1984).

In addition, similar tables are derived for the critical values of the modified
K-S, AD and CvM goodness of fit tests for the logistic distribution with unknown shape
and location parameter, using MLE to estimate the parameters (Woodruff et al.,
1986).

Porter and Moore derived tables of critical values for the modified K-S, AD
and CvM goodness of fit statistics for the Pareto distribution with unknown shape
parameter. The powers were shown for eight alternative distributions. In addition
they derived a functional relationship between the shape parameters and the critical
values of the test.

Yen and Moore derived tables of critical values for the modified AD and CvM
goodness of fit statistics for the Laplace distribution. The critical values were tabled
for sample sizes n = 5(5)50 and significance levels α = .1, .2, .5. The AD test generally
yielded higher power than the CvM test.

Harter et al. (1984) modified the definition of the C.D.F at the ith order
statistic to obtain a modified K-S test statistic when the probability model is completely
specified. They have shown that their proposed test is more powerful than the usual
K-S tests for small to moderate sample sizes.

New goodness of fit tests for symmetric alternatives were obtained by Moore
for the normal distribution by using a reflection technique in which the data points
are reflected about an invariant estimate of the mean, which doubles the
sample size. New tables were derived for the K-S, AD and CvM statistics.

Similar work was done by Woodruff et al. for the uniform distribution and
by Yen and Moore for the Laplace distribution.
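The reflection step itself is simple to illustrate. The sketch below assumes the sample mean as the estimate of the center; the estimator actually used in the cited work is not stated here, and the function name `reflect_about_center` is hypothetical.

```python
import numpy as np

def reflect_about_center(x):
    """Return the sample together with its mirror image about the
    estimated center, giving 2n points that are symmetric by construction."""
    x = np.asarray(x, dtype=float)
    center = x.mean()  # assumed estimate of the center of symmetry
    return np.concatenate([x, 2.0 * center - x])
```

The reflected sample is no longer a set of independent observations, which is why new tables of critical values are needed rather than the standard ones.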

As a final note, a problem arises when a goodness of fit test fails to reject two
families of distributions, which means that the test does not sufficiently discriminate
between these two families. Bain used a likelihood ratio test to discriminate normal versus
two parameter exponential; normal versus double exponential; normal versus Cauchy
or Weibull versus lognormal; and extreme value versus normal.


Methodology 

Two basic test statistics are used in this application. These statistics are based
on the Cramer-von Mises and the Anderson-Darling statistics.

(1) Cramer-von Mises statistic

The Cramer-von Mises statistic is defined as

\[ W_n^2 = \int_{-\infty}^{\infty} \left[ F_n(x) - F_0(x) \right]^2 \, dF_0(x) \tag{134} \]

This statistic has the well known classical results that:

- W_n^2 is a directed distance,

i.e. for any proper distribution functions F_1(x) and F_2(x):

\[ W_n^2(F_1, F_2) = 0 \iff F_1(x) = F_2(x) \tag{135} \]

and

\[ W_n^2(F_1, F_2) + W_n^2(F_2, F_3) \ge W_n^2(F_1, F_3) \tag{136} \]

- W_n^2 is symmetric,

i.e. if F_1(.) and F_2(.) are continuous then:

\[ W_n^2(F_1, F_2) = W_n^2(F_2, F_1) \tag{137} \]

- Since 0 ≤ F(x) ≤ 1, it follows that 0 ≤ W^2(F_1, F_2) ≤ 1.

(2) Anderson-Darling A^2

The AD statistic, considered a member of the Cramer-von Mises family, is defined
through:

\[ Q = \int_{-\infty}^{\infty} \left\{ F_n(x) - F(x) \right\}^2 \psi(x) \, dF(x) \tag{138} \]

where ψ(x) is some function that weights the square of the difference between both
distribution functions. The CvM statistic sets this weight equal to 1, while the AD
statistic uses the weight ψ(x) = 1/[F(x)(1 - F(x))], which leads to the computational form

\[ A^2 = -n - \frac{1}{n} \sum_{i=1}^{n} (2i-1) \left[ \log F(X_{(i)}) + \log\left(1 - F(X_{(n+1-i)})\right) \right] \tag{139} \]

\[ A^2 = -n - \frac{1}{n} \sum_{i=1}^{n} (2i-1) \left[ \ln F_i + \ln\left(1 - F_{n+1-i}\right) \right] \tag{140} \]

In this context a goodness of fit test is run using an M.C size of 1000. The test is
based on the AD test statistic, where the nonparametric probability is used in place of
the E.D.F. The AD statistic is more sensitive to the distribution tail length because of
the construction of the weight function above. The property of the E.D.F that makes it
natural to use for goodness of fit to F0(x) (the theoretical distribution) is its
uniform, almost sure convergence to F0(x). Informally, the rule is to reject
when W^2 is large and accept when it is small.

In the application here F0(x) is assumed to be a univariate continuous distribution
function. This means that F0(X_i) will be uniformly distributed on (0,1).
The asymptotic behavior of W^2 when F0(x) = F(x) is given by:

\[ P\left\{ nW^2 \le x \right\} \to F_{W^2}(x) \tag{141} \]

where

\[ F_{W^2}(x) = \frac{1}{\sqrt{x}} \cdots \tag{142} \]


The technique used is based upon the idea of using the nonparametric density
estimator in place of the E.D.F. Hence the goodness of fit application documents
a further new application of the estimator, with all the elements of a complete test.

The  Technique  And  The  Results 

The Monte Carlo procedure for this test was divided into three basic stages.

Stage I

(1) Determine the critical values for the test statistic at the predetermined
significance levels (.01, .05 (.05) .20).

(2) Compute the value of W^2 for each of the 1000 M.C cases as a measure of
the distance between the parametric density, with the maximum likelihood estimators
for the parameters (x̄, s²), and a nonparametric fit for each sample.

(3) The 1000 M.C samples yield a corresponding sample of size 1000 for W^2.
Thus, there are two ways to find the distribution of W^2:

(a) Use a plotting position.

(b) Fit a continuous nonparametric distribution.

(4) The nonparametric fit is used and the inverse function of the corresponding
C.D.F is computed at the different levels of significance.
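The four steps of Stage I can be sketched as follows. This is a hedged illustration rather than the program used in this work: the nonparametric fit is assumed here to be a Gaussian kernel estimate with a rule-of-thumb bandwidth, the integral is evaluated by the trapezoid rule on a grid, and the critical values are read from empirical quantiles (option (a)) instead of a fitted nonparametric distribution. All function names are hypothetical.

```python
import math
import numpy as np

norm_cdf = np.vectorize(lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))

def w2_kernel(x, grid_points=201):
    """Integrated squared distance between a kernel C.D.F. estimate and
    the normal C.D.F. fitted by maximum likelihood, weighted by dF0."""
    x = np.asarray(x, dtype=float)
    n = x.size
    mu, sigma = x.mean(), x.std()        # MLE of the normal parameters
    h = 1.06 * sigma * n ** (-0.2)       # assumed rule-of-thumb bandwidth
    t = np.linspace(mu - 6.0 * sigma, mu + 6.0 * sigma, grid_points)
    u = (t - mu) / sigma
    f0_cdf = norm_cdf(u)
    f0_pdf = np.exp(-0.5 * u ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    # Kernel C.D.F. estimate: average of normal C.D.F.s centered at the data.
    fhat = norm_cdf((t[:, None] - x[None, :]) / h).mean(axis=1)
    y = (fhat - f0_cdf) ** 2 * f0_pdf
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))  # trapezoid rule

def critical_values(n, alphas=(0.20, 0.15, 0.10, 0.05, 0.01),
                    reps=1000, seed=0):
    """Simulate W^2 under the normal null and return upper quantiles."""
    rng = np.random.default_rng(seed)
    w2 = np.array([w2_kernel(rng.standard_normal(n)) for _ in range(reps)])
    return {a: float(np.quantile(w2, 1.0 - a)) for a in alphas}
```

A call such as `critical_values(10, reps=1000)` returns one critical value per significance level. Because the bandwidth and integration choices are assumptions, the numbers it produces are not expected to match the tabled values.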




Stage  II 


(1) The corresponding power study for the hypotheses is conducted under H0
and the power is computed.

(2) The test shows powers which were reasonably close to the α-levels.

Stage  III 

The members of the following family of distributions are used as alternative
distributions:

-Uniform

-χ² with 1 d.f

-χ² with 4 d.f

-Exponential

-Cauchy

-D.E

-t-student with 3 d.f

-Logistic distribution

This family of distributions gives a variety of shapes and characteristics. The
results from this part are shown in the following tables. The tables give the critical
values for both cases, when the CvM statistic is used and when the AD statistic is
used. The tables also show the power of both tests for different sample sizes.
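The power study of Stages II and III can be sketched in the same spirit. As a stand-in for the density-estimate-based statistic, this sketch uses the ordinary E.D.F. form of W^2 with the normal parameters estimated from each sample; that substitution, the replication count, and the names `w2_fitted_normal`, `ALTERNATIVES` and `power_table` are assumptions made for illustration.

```python
import math
import numpy as np

norm_cdf = np.vectorize(lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))

def w2_fitted_normal(x):
    """E.D.F.-based W^2 against a normal fitted by maximum likelihood."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    f = norm_cdf((x - x.mean()) / x.std())
    i = np.arange(1, n + 1)
    return float(np.sum((f - (2 * i - 1) / (2 * n)) ** 2) + 1.0 / (12 * n))

# Samplers for the null ("Normal") and the eight alternatives listed above.
ALTERNATIVES = {
    "Normal": lambda rng, n: rng.standard_normal(n),
    "Uniform": lambda rng, n: rng.uniform(size=n),
    "Chi(1)": lambda rng, n: rng.chisquare(1, size=n),
    "Chi(4)": lambda rng, n: rng.chisquare(4, size=n),
    "Expon": lambda rng, n: rng.exponential(size=n),
    "Cauchy": lambda rng, n: rng.standard_cauchy(n),
    "D.E": lambda rng, n: rng.laplace(size=n),
    "t(3)": lambda rng, n: rng.standard_t(3, size=n),
    "Logistic": lambda rng, n: rng.logistic(size=n),
}

def power_table(n, alpha=0.05, reps=1000, seed=0):
    """Estimate the rejection rate under each distribution at level alpha."""
    rng = np.random.default_rng(seed)
    null = [w2_fitted_normal(rng.standard_normal(n)) for _ in range(reps)]
    crit = np.quantile(null, 1.0 - alpha)   # Stage I critical value
    return {
        name: float(np.mean([w2_fitted_normal(draw(rng, n)) > crit
                             for _ in range(reps)]))
        for name, draw in ALTERNATIVES.items()
    }
```

The "Normal" entry plays the role of Stage II: its rejection rate should stay close to the chosen significance level. The remaining entries correspond to the Stage III alternatives.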




Table 19. Critical Value for the New Suggested Test
for Sample Size = 5 (5) 60
(Using CvM)
(at Significance Levels .2, .15, .1, .05, .01)

 N      0.20      0.15      0.10      0.05      0.01
 5      0.0341    0.0352    0.0364    0.0382    0.0406
10      0.0335    0.0355    0.0387    0.0437    0.0507
15      0.0355    0.0385    0.0418    0.0478    0.0568
20      0.0384    0.0420    0.0455    0.0533    0.0717
25      0.0397    0.0436    0.0495    0.0568    0.0742
30      0.0414    0.0459    0.0508    0.0599    0.0739
35      0.0419    0.0467    0.0520    0.0623    0.0786
40      0.0447    0.0485    0.0554    0.0654    0.0874
45      0.0473    0.0522    0.0590    0.0719    0.0918
50      0.0487    0.0534    0.0600    0.0695    0.0989
55      0.0504    0.0556    0.0637    0.0753    0.0970
60      0.0510    0.0563    0.0639    0.0771    0.0977


Table 20. Power of Tests for Normal Distribution
with Sample Size = 5 (5) 60
(Using CvM)
(at Significance Levels .2, .15, .1, .05, .01)

 N      0.20      0.15      0.10      0.05      0.01
 5      0.2072    0.1667    0.1024    0.0572    0.0121
10      0.2209    0.1559    0.1083    0.0546    0.0105
15      0.2115    0.1575    0.1032    0.0481    0.0105
20      0.1979    0.1485    0.1000    0.0521    0.0103
25      0.1990    0.1504    0.0968    0.0505    0.0099
30      0.2041    0.1455    0.0975    0.0515    0.0101
35      0.1940    0.1489    0.1018    0.0503    0.0096
40      0.2011    0.1502    0.0968    0.0482    0.0100
45      0.1997    0.1392    0.0957    0.0476    0.0102
50      0.1951    0.1427    0.1018    0.0491    0.0098
55      0.1917    0.1505    0.0986    0.0498    0.0096
60      0.1965    0.1477    0.0972    0.0483    0.0097


Table 21. Power of Tests for Normal Distribution with Sample Size = 5
(Using CvM)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.2590    0.4480    0.2330    0.3170
.15           0.1990    0.3860    0.1760    0.2530
.10           0.1480    0.3270    0.1420    0.2040
.05           0.0700    0.2510    0.0630    0.1420
.01           0.0070    0.1170    0.0170    0.0380

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.4060    0.1630    0.2350    0.1640
.15           0.3680    0.1190    0.1900    0.1220
.10           0.3280    0.0850    0.1450    0.0840
.05           0.2550    0.0490    0.0930    0.0430
.01           0.1640    0.0090    0.0320    0.0050

Chi(k) = Chi square with k d.f
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f


Table 22. Power of Tests for Normal Distribution with Sample Size = 10
(Using CvM)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.5310    0.8090    0.4520    0.6420
.15           0.4660    0.7660    0.3950    0.5650
.10           0.3320    0.6640    0.2810    0.4450
.05           0.2050    0.5000    0.1630    0.2920
.01           0.0700    0.2750    0.0580    0.1370

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.5280    0.2320    0.2540    0.2110
.15           0.5030    0.1930    0.2120    0.1710
.10           0.4560    0.1310    0.1460    0.1020
.05           0.3790    0.0630    0.1000    0.0470
.01           0.2790    0.0250    0.0450    0.0150

Chi(k) = Chi square with k d.f
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f


Table 23. Power of Tests for Normal Distribution with Sample Size = 15
(Using CvM)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.6600    0.9370    0.5550    0.7940
.15           0.5810    0.9050    0.4750    0.7400
.10           0.5040    0.8540    0.3800    0.6590
.05           0.3500    0.7610    0.2830    0.5180
.01           0.1590    0.5730    0.1260    0.3030

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.6390    0.2040    0.2560    0.2040
.15           0.6110    0.1640    0.2060    0.1500
.10           0.5810    0.1230    0.1590    0.1110
.05           0.5320    0.0770    0.1060    0.0600
.01           0.4540    0.0360    0.0560    0.0190

Chi(k) = Chi square with k d.f
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f


Table 24. Power of Tests for Normal Distribution with Sample Size = 20
(Using CvM)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.7530    0.9800    0.6740    0.8580
.15           0.6670    0.9690    0.5910    0.8130
.10           0.5820    0.9550    0.4990    0.7580
.05           0.3930    0.8890    0.3500    0.6300
.01           0.1070    0.6060    0.1240    0.3100

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.6610    0.1930    0.2550    0.1760
.15           0.6410    0.1560    0.2140    0.1320
.10           0.6160    0.1210    0.1850    0.1100
.05           0.5690    0.0800    0.1350    0.0510
.01           0.4670    0.0190    0.0620    0.0060

Chi(k) = Chi square with k d.f
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f


Table 25. Power of Tests for Normal Distribution with Sample Size = 25
(Using CvM)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.8360    0.9940    0.7570    0.9250
.15           0.7680    0.9900    0.7100    0.8950
.10           0.6360    0.9790    0.6090    0.8350
.05           0.5050    0.9520    0.4780    0.7520
.01           0.1910    0.8070    0.2340    0.4980

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.7410    0.1790    0.2530    0.1610
.15           0.7130    0.1330    0.2200    0.1180
.10           0.6840    0.0940    0.1700    0.0780
.05           0.6420    0.0650    0.1270    0.0400
.01           0.5640    0.0260    0.0740    0.0040

Chi(k) = Chi square with k d.f
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f


Table 26. Power of Tests for Normal Distribution with Sample Size = 30
(Using CvM)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.8720    0.9980    0.8140    0.9690
.15           0.8210    0.9950    0.7350    0.9530
.10           0.7500    0.9880    0.6600    0.9290
.05           0.6010    0.9740    0.5370    0.8540
.01           0.3470    0.9200    0.3340    0.6900

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.8080    0.1680    0.2690    0.1430
.15           0.7740    0.1390    0.2320    0.0990
.10           0.7450    0.1000    0.1980    0.0680
.05           0.6760    0.0690    0.1550    0.0330
.01           0.6080    0.0360    0.1120    0.0100

Chi(k) = Chi square with k d.f
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f


Table 27. Power of Tests for Normal Distribution with Sample Size = 35
(Using CvM)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.9170    1.0000    0.8740    0.9800
.15           0.8770    0.9990    0.8310    0.9690
.10           0.8280    0.9980    0.7780    0.9470
.05           0.6830    0.9900    0.6440    0.8750
.01           0.4080    0.9620    0.4230    0.7420

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.8840    0.1720    0.3040    0.1450
.15           0.8610    0.1380    0.2630    0.1020
.10           0.8350    0.1100    0.2380    0.0750
.05           0.7720    0.0780    0.1860    0.0370
.01           0.6830    0.0360    0.1220    0.0080

Chi(k) = Chi square with k d.f
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f


Table 28. Power of Tests for Normal Distribution with Sample Size = 40
(Using CvM)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.9370    0.9990    0.9000    0.9900
.15           0.9180    0.9990    0.8680    0.9830
.10           0.8560    0.9980    0.7830    0.9710
.05           0.7380    0.9960    0.6740    0.9320
.01           0.3820    0.9750    0.4380    0.8260

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.9390    0.1710    0.3110    0.1290
.15           0.9200    0.1480    0.2800    0.0990
.10           0.8910    0.1090    0.2340    0.0640
.05           0.8420    0.0710    0.1860    0.0360
.01           0.7380    0.0280    0.1250    0.0070

Chi(k) = Chi square with k d.f
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f


Table 29. Power of Tests for Normal Distribution with Sample Size = 45
(Using CvM)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.9530    1.0000    0.9040    0.9930
.15           0.9320    1.0000    0.8720    0.9890
.10           0.8910    1.0000    0.8130    0.9780
.05           0.7390    0.9990    0.7010    0.9520
.01           0.4480    0.9900    0.5110    0.8710

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.9410    0.1730    0.3100    0.1250
.15           0.9290    0.1460    0.2770    0.0870
.10           0.9120    0.0980    0.2370    0.0620
.05           0.8770    0.0670    0.1800    0.0270
.01           0.7990    0.0290    0.1250    0.0080

Chi(k) = Chi square with k d.f
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f


Table 30. Power of Tests for Normal Distribution with Sample Size = 50
(Using CvM)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.9630    1.0000    0.9410    0.9990
.15           0.9440    1.0000    0.9210    0.9990
.10           0.9170    1.0000    0.8890    0.9960
.05           0.8360    0.9990    0.8170    0.9860
.01           0.4620    0.9910    0.5450    0.9040

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.9610    0.1920    0.3260    0.1240
.15           0.9540    0.1610    0.2970    0.0950
.10           0.9440    0.1250    0.2520    0.0610
.05           0.9250    0.0780    0.1990    0.0300
.01           0.8330    0.0190    0.1310    0.0060

Chi(k) = Chi square with k d.f
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f


Table 31. Power of Tests for Normal Distribution with Sample Size = 55
(Using CvM)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.9810    1.0000    0.9600    0.9990
.15           0.9600    1.0000    0.9490    0.9980
.10           0.9190    1.0000    0.9110    0.9940
.05           0.8430    1.0000    0.8400    0.9870
.01           0.5910    1.0000    0.6620    0.9580

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.9800    0.2030    0.3420    0.1050
.15           0.9730    0.1710    0.3180    0.0820
.10           0.9600    0.1190    0.2760    0.0440
.05           0.9440    0.0690    0.2340    0.0200
.01           0.8970    0.0300    0.1680    0.0040

Chi(k) = Chi square with k d.f
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f


Table 32. Power of Tests for Normal Distribution with Sample Size = 60
(Using CvM)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.9830    1.0000    0.9810    1.0000
.15           0.9720    1.0000    0.9770    0.9990
.10           0.9480    1.0000    0.9550    0.9980
.05           0.8780    1.0000    0.8890    0.9960
.01           0.6750    0.9990    0.7290    0.9740

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.9850    0.2180    0.3590    0.1120
.15           0.9810    0.1700    0.3200    0.0760
.10           0.9730    0.1360    0.2840    0.0520
.05           0.9600    0.0770    0.2280    0.0200
.01           0.9290    0.0330    0.1750    0.0080

Chi(k) = Chi square with k d.f
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f


Table 33. Critical Value for the New Suggested Test
for Sample Size = 5 (5) 60
(Using AD)
(at Significance Levels .2, .15, .1, .05, .01)

 N      0.20      0.15      0.10      0.05      0.01
 5      1.1629    1.2454    1.4128    1.6494    2.1284
10      1.5496    1.6466    1.7901    2.1274    2.8964
15      1.9488    2.0551    2.1764    2.4629    3.2314
20      2.2129    2.3563    2.5602    2.8515    3.6844
25      2.4551    2.5653    2.7135    3.0238    3.6552
30      2.6596    2.7707    2.9593    3.2455    4.1849
35      2.8755    2.9862    3.1885    3.4801    4.2493
40      3.0863    3.2155    3.3152    3.7069    4.3719
45      3.2613    3.4019    3.5699    3.8462    4.6924
50      3.4458    3.5522    3.7423    4.0242    4.6589
55      3.6080    3.7104    3.9092    4.1381    4.8513
60      3.7494    3.8857    4.0693    4.3398    4.9610


Table 34. Power of Tests for Normal Distribution
with Sample Size = 5 (5) 60
(Using AD)
(at Significance Levels .2, .15, .1, .05, .01)

 N      0.20      0.15      0.10      0.05      0.01
 5      0.1640    0.1170    0.0600    0.0270    0.0050
10      0.2140    0.1700    0.1100    0.0520    0.0130
15      0.1950    0.1580    0.1140    0.0660    0.0090
20      0.1780    0.1140    0.0630    0.0290    0.0030
25      0.1760    0.1400    0.1030    0.0490    0.0140
30      0.1850    0.1360    0.0830    0.0400    0.0080
35      0.1910    0.1490    0.0870    0.0440    0.0040
40      0.1820    0.1340    0.1070    0.0340    0.0050
45      0.1680    0.1110    0.0660    0.0310    0.0002
50      0.1770    0.1280    0.0780    0.0460    0.0050
55      0.1870    0.1420    0.0960    0.0510    0.0100
60      0.1850    0.1280    0.0900    0.0420    0.0100


Table 35. Power of Tests for Normal Distribution with Sample Size = 5
(Using AD)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.1590    0.6210    0.3760    0.4950
.15           0.1140    0.5810    0.3190    0.4460
.10           0.0540    0.4960    0.2220    0.3400
.05           0.0160    0.3710    0.1380    0.2280
.01           0.0050    0.2160    0.0310    0.1000

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.4590    0.1860    0.2740    0.1510
.15           0.4210    0.1320    0.2160    0.0980
.10           0.3370    0.0720    0.1440    0.0570
.05           0.2330    0.0410    0.0820    0.0220
.01           0.1150    0.0090    0.0390    0.0060

Chi(k) = Chi square with k d.f
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f


Table 36. Power of Tests for Normal Distribution with Sample Size = 10
(Using AD)
(Power from K-S test in brackets and with * when better)
(Normal against one of the following:)

Sign. level   Uniform    Chi(1)     Chi(4)     Expon.
.20           0.2170     0.9410*    0.6460*    0.8120*
              (.2688)    (.7850)    (.4138)    (.5710)
.15           0.1690     0.9280*    0.5920*    0.7670*
              (.2112)    (.7366)    (.3488)    (.5120)
.10           0.1090     0.8970*    0.5320*    0.7180*
              (.1420)    (.6608)    (.2716)    (.4318)
.05           0.0510     0.8240*    0.3720*    0.5790*
              (.0724)    (.5420)    (.1806)    (.3208)
.01           0.0080     0.6350*    0.1700*    0.3420*
              (.0128)    (.3430)    (.0708)    (.1612)

Sign. level   Cauchy     D.E        t(3)       Logistic
.20           0.6980     0.3200     0.3750*    0.2530*
              (.7306)    (.3604)    (.3610)    (.2486)
.15           0.6570     0.2760     0.3190*    0.1860
              (.6998)    (.3030)    (.3066)    (.1990)
.10           0.5870     0.2090     0.2540*    0.1240
              (.6532)    (.2376)    (.2500)    (.1418)
.05           0.4510     0.1060     0.1540     0.0590
              (.5884)    (.1572)    (.1726)    (.0874)
.01           0.2830     0.0340     0.0590     0.0150
              (.4660)    (.0646)    (.0838)    (.0252)

Chi(k) = Chi square with k d.f
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f


Table 37. Power of Tests for Normal Distribution with Sample Size = 15
(Using AD)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.2200    0.9910    0.7400    0.9190
.15           0.1720    0.9890    0.6880    0.8960
.10           0.1250    0.9830    0.6400    0.8650
.05           0.0590    0.9590    0.5250    0.7910
.01           0.0100    0.8780    0.2730    0.5940

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.8500    0.3430    0.3940    0.2240
.15           0.8090    0.2940    0.3430    0.1730
.10           0.7780    0.2320    0.2940    0.1210
.05           0.7000    0.1430    0.2110    0.0640
.01           0.5460    0.0370    0.0880    0.0180

Chi(k) = Chi square with k d.f
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f


Table 38. Power of Tests for Normal Distribution with Sample Size = 20
(Using AD)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.2440    0.9980    0.8700    0.9660
.15           0.1570    0.9980    0.8240    0.9500
.10           0.1010    0.9960    0.7590    0.9210
.05           0.0500    0.9890    0.6430    0.8760
.01           0.0080    0.9580    0.3610    0.7110

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.9160    0.4010    0.4640    0.2470
.15           0.8900    0.3490    0.3970    0.1780
.10           0.8570    0.2520    0.3280    0.1100
.05           0.7900    0.1550    0.2430    0.0510
.01           0.6410    0.0370    0.1090    0.0120

Chi(k) = Chi square with k d.f
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f


Table 39. Power of Tests for Normal Distribution with Sample Size = 25
(Using AD)
(Power from K-S test in brackets and with * when better)
(Normal against one of the following:)

Sign. level   Uniform    Chi(1)     Chi(4)     Expon.
.20           0.2800     1.0000*    0.9170*    0.9960*
              (.3704)    (.9904)    (.6566)    (.8914)
.15           0.2060     1.0000*    0.8850*    0.9910*
              (.2998)    (.9860)    (.5974)    (.8528)
.10           0.1440     1.0000*    0.8570*    0.9870*
              (.2156)    (.9738)    (.5146)    (.7960)
.05           0.0710     0.9990*    0.7590*    0.9550*
              (.1172)    (.9492)    (.3872)    (.6882)
.01           0.0220     0.9940*    0.5710*    0.8690*
              (.0294)    (.8484)    (.1932)    (.4536)

Sign. level   Cauchy     D.E        t(3)       Logistic
.20           0.9680*    0.4710     0.5350*    0.2630
              (.9559)    (.5084)    (.5138)    (.2670)
.15           0.9600*    0.4040     0.4680*    0.2080
              (.9452)    (.4402)    (.4596)    (.2150)
.10           0.9530*    0.3250     0.4080*    0.1570*
              (.9298)    (.3618)    (.3866)    (.1494)
.05           0.9190*    0.2150     0.2970     0.0880*
              (.9000)    (.2566)    (.3004)    (.0876)
.01           0.8210*    0.0850     0.1620     0.0200
              (.8385)    (.1196)    (.1700)    (.0244)

Chi(k) = Chi square with k d.f
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f


Table 40. Power of Tests for Normal Distribution with Sample Size = 30
(Using AD)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.3280    1.0000    0.9410    0.9970
.15           0.2340    1.0000    0.9230    0.9970
.10           0.1570    1.0000    0.8790    0.9940
.05           0.0750    1.0000    0.8180    0.9890
.01           0.0130    0.9960    0.5540    0.9130

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.9840    0.5210    0.5960    0.2740
.15           0.9820    0.4560    0.5440    0.2200
.10           0.9590    0.3530    0.4570    0.1540
.05           0.9420    0.2390    0.3390    0.0880
.01           0.8370    0.0650    0.1600    0.0120

Chi(k) = Chi square with k d.f
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f


Table 41. Power of Tests for Normal Distribution with Sample Size = 35
(Using AD)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.3340    1.0000    0.9670    1.0000
.15           0.2650    1.0000    0.9550    0.9990
.10           0.1630    1.0000    0.9270    0.9960
.05           0.0900    1.0000    0.8740    0.9910
.01           0.0140    0.9990    0.7180    0.9510

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.9920    0.5640    0.6460    0.2810
.15           0.9920    0.4950    0.5960    0.2310
.10           0.9840    0.3910    0.4990    0.1450
.05           0.9690    0.2600    0.3960    0.0870
.01           0.9170    0.0810    0.2190    0.0170

Chi(k) = Chi square with k d.f
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f


Table 42. Power of Tests for Normal Distribution with Sample Size = 40
(Using AD)
(Power from K-S test in brackets and with * when better)
(Normal against one of the following:)

Sign. level   Uniform    Chi(1)     Chi(4)     Expon.
.20           0.3770     1.0000*    0.9880*    1.0000*
              (.5284)    (.9896)    (.8340)    (.9828)
.15           0.2850     1.0000*    0.9740*    1.0000*
              (.4482)    (.9844)    (.7910)    (.9752)
.10           0.2290     1.0000*    0.9680*    1.0000*
              (.3424)    (.9726)    (.7248)    (.9556)
.05           0.0890     1.0000*    0.9050*    0.9990*
              (.1978)    (.9490)    (.6036)    (.9074)
.01           0.0250     1.0000*    0.7570*    0.9900*
              (.0454)    (.8570)    (.3548)    (.7204)

Sign. level   Cauchy     D.E        t(3)       Logistic
.20           0.9970*    0.6110     0.6650*    0.2810
              (.9918)    (.6376)    (.6324)    (.2958)
.15           0.9960*    0.5300     0.5990*    0.2180
              (.9888)    (.5858)    (.5892)    (.2450)
.10           0.9950*    0.4630     0.5590*    0.1810*
              (.9862)    (.5114)    (.5132)    (.1798)
.05           0.9840*    0.2870     0.4200*    0.0850
              (.9766)    (.3852)    (.4140)    (.1044)
.01           0.9560*    0.1090     0.2610*    0.0250
              (.9498)    (.1820)    (.2482)    (.0312)

Chi(k) = Chi square with k d.f
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f


Table 43. Power of Tests for Normal Distribution with Sample Size = 45

(Using AD)

(Normal against one of the following:)

Sign. level    Uniform    Chi(1)     Chi(4)     Expon.
.20            0.4360     1.0000     0.9890     1.0000
.15            0.3440     1.0000     0.9810     1.0000
.10            0.2280     1.0000     0.9680     0.9990
.05            0.1410     1.0000     0.9370     0.9990
.01            0.0220     1.0000     0.8100     0.9920

Sign. level    Cauchy     D.E        t(3)       Logistic
.20            0.9990     0.6590     0.7130     0.3210
.15            0.9970     0.5640     0.6540     0.2410
.10            0.9960     0.4730     0.5770     0.1760
.05            0.9900     0.3480     0.4500     0.0900
.01            0.9630     0.1030     0.2350     0.0180

Chi(k) = Chi square with k d.f
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f




Table 44. Power of Tests for Normal Distribution with Sample Size = 50

(Using AD)

(Normal against one of the following:)

Sign. level    Uniform    Chi(1)     Chi(4)     Expon.
.20            0.4630     1.0000     0.9900     1.0000
.15            0.3860     1.0000     0.9850     1.0000
.10            0.2800     1.0000     0.9720     1.0000
.05            0.1510     1.0000     0.9580     1.0000
.01            0.0390     1.0000     0.8960     0.9980

Sign. level    Cauchy     D.E        t(3)       Logistic
.20            0.9990     0.7140     0.7270     0.3390
.15            0.9990     0.6510     0.6810     0.2720
.10            0.9990     0.5420     0.5940     0.1900
.05            0.9970     0.3990     0.4890     0.0970
.01            0.9820     0.1710     0.3280     0.0270

Chi(k) = Chi square with k d.f
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f




Table 45. Power of Tests for Normal Distribution with Sample Size = 55

(Using AD)

(Normal against one of the following:)

Sign. level    Uniform    Chi(1)     Chi(4)     Expon.
.20            0.5140     1.0000     0.9970     1.0000
.15            0.4340     1.0000     0.9970     1.0000
.10            0.3090     1.0000     0.9940     1.0000
.05            0.1930     1.0000     0.9840     1.0000
.01            0.0430     1.0000     0.9280     0.9980

Sign. level    Cauchy     D.E        t(3)       Logistic
.20            1.0000     0.7450     0.7480     0.3510
.15            0.9990     0.6850     0.7080     0.2930
.10            0.9990     0.5720     0.6300     0.2060
.05            0.9990     0.4550     0.5480     0.1210
.01            0.9910     0.1910     0.3620     0.0260

Chi(k) = Chi square with k d.f
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f




Table 46. Power of Tests for Normal Distribution with Sample Size = 60

(Using AD)

(Power from K-S test in brackets and with * when better or the same)
(Normal against one of the following:)

Sign. level    Uniform    Chi(1)     Chi(4)     Expon.
.20            0.5900     1.0000*    1.0000*    1.0000*
               (.6800)    (1.000)    (.9348)    (.9994)
.15            0.4650     1.0000*    0.9990*    1.0000*
               (.6012)    (1.000)    (.9132)    (.9984)
.10            0.3380     1.0000*    0.9970*    1.0000*
               (.4918)    (1.000)    (.8640)    (.9960)
.05            0.1980     1.0000*    0.9940*    1.0000*
               (.3038)    (1.000)    (.7648)    (.9838)
.01            0.0610     1.0000*    0.9640*    1.0000*
               (.0952)    (.9998)    (.5392)    (.9312)

Sign. level    Cauchy     D.E        t(3)       Logistic
.20            1.0000*    0.7710*    0.7810*    0.3780*
               (.9994)    (.7536)    (.7512)    (.3306)
.15            1.0000*    0.7010     0.7400*    0.2940*
               (.9990)    (.7036)    (.7024)    (.2736)
.10            1.0000*    0.6130     0.6700*    0.2130*
               (.9986)    (.6264)    (.6356)    (.1990)
.05            1.0000*    0.4660     0.5780*    0.1090
               (.9970)    (.4816)    (.5282)    (.1130)
.01            0.9940*    0.2290     0.3800*    0.0270
               (.9926)    (.2664)    (.3632)    (.0342)

Chi(k) = Chi square with k d.f
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f




Thus, this application defines a new modified goodness of fit test based on the
nonparametric kernel density estimator. Both the CvM and AD statistics are used.
The critical values are derived by Monte Carlo experiment. The power of the
test for the CvM and the AD statistics is then obtained when the underlying
distribution is normal. This power is close to the significance level, as it
should be. The test is then performed against each of the eight different alternatives.
With the CvM statistic, the power against the different distributions increases
with sample size. The test discriminates the other distributions with high
power; however, it does not do as well for the double exponential and the logistic
distributions. The modified test using the AD statistic gives better power than the
test based on the CvM statistic for the different alternatives except for the uniform
distribution, because the AD statistic is more sensitive to the tails of the
distributions than the CvM. The power of the test using the AD statistic
is compared to that of the classical K-S test for sample sizes 10, 25, 40, and 60.
The new modified test using the AD statistic shows an improvement
over the classical K-S test in most cases. The uniform distribution is the
consistent exception, and for sample sizes 40 and 60 (Tables 42 and 46) the K-S
test also has higher power against the double exponential at most significance
levels and against the logistic at some.




VIII. Adaptive Nonparametric Kernel Density Estimation Application

Introduction 

In this chapter an "adaptive" approach to density estimation is introduced.
This approach is based on a given criterion according to which a suitable or near
optimal adaptive choice of the window width is made for the kernel fit.

The general strategy for this application is to generate different samples from
various distributions. For each sample, a criterion to classify or discriminate the parent
distribution from which the sample is drawn is computed. Based on the criterion, a
suitable choice of the window width for each case is found. The chosen h value is
considered an adaptive choice since it varies with the computed sample
criterion. Once the adaptive choice for the h parameter is found, a nonparametric kernel
estimate of the underlying density is computed.

Percentile  Ratios 

For the development of this application, a discriminant was needed. The kurtosis,
Hogg's Q statistic, and the percentile ratios are examples of discriminants
that could be used. Since both the kurtosis and the Q statistic average the measure
for the upper and lower tail lengths, they are not suited to asymmetric
distributions. The percentile ratios were chosen as the discriminant since
they measure both tail lengths separately. The upper and lower tail lengths of the
distribution are measured by the upper and lower percentile ratios, which are defined
respectively to be:


\[ P_u = \frac{F^{-1}(.975) - F^{-1}(.5)}{F^{-1}(.75) - F^{-1}(.5)} \tag{143} \]

\[ P_l = \frac{F^{-1}(.5) - F^{-1}(.025)}{F^{-1}(.5) - F^{-1}(.25)} \tag{144} \]


where $F^{-1}(a)$ represents the a percentile of the distribution.

The population percentile ratios for some distributions with location parameter
zero and scale parameter 1 are given in the following table.

Table 47. Values of Percentile Ratios for Different Distributions

Distribution          P_l       P_u
Uniform               1.900     1.900
Logistic              3.343     3.343
Exponential           1.647     4.322
Double Exponential    4.322     4.322
Cauchy                12.706    12.706
Normal                2.904     2.904
Beta(1/2,1/2)         1.409     1.409
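
The population ratios can be checked directly from the quantile functions. The following is a minimal sketch, assuming the standard forms (location 0, scale 1) of each distribution; the dictionary `QUANTILES` and the function `percentile_ratios` are illustrative names, not part of the original work.

```python
import math
from statistics import NormalDist

# Quantile functions F^{-1}(p) for the standard form of each distribution
# (assumed here: location 0, scale 1).
QUANTILES = {
    "Uniform": lambda p: p,
    "Logistic": lambda p: math.log(p / (1.0 - p)),
    "Exponential": lambda p: -math.log(1.0 - p),
    "Double Exponential": lambda p: (math.log(2.0 * p) if p < 0.5
                                     else -math.log(2.0 * (1.0 - p))),
    "Cauchy": lambda p: math.tan(math.pi * (p - 0.5)),
    "Normal": NormalDist().inv_cdf,
    "Beta(1/2,1/2)": lambda p: math.sin(math.pi * p / 2.0) ** 2,
}


def percentile_ratios(q):
    """Return (P_l, P_u) of equations (143)-(144) for a quantile function q."""
    med = q(0.5)
    p_u = (q(0.975) - med) / (q(0.75) - med)
    p_l = (med - q(0.025)) / (med - q(0.25))
    return p_l, p_u


if __name__ == "__main__":
    for name, q in QUANTILES.items():
        p_l, p_u = percentile_ratios(q)
        print(f"{name:20s} P_l = {p_l:7.3f}   P_u = {p_u:7.3f}")
```

For the exponential distribution the lower ratio is the small one (about 1.647) and the upper ratio the large one (about 4.322), which is how the two columns of the table can be told apart. The computed values for the normal and logistic cases may differ from the tabulated ones in the last digit or two.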

The median rank is used to find the sample percentiles based on a sample
of size 60. This gives the sample percentile ratios as:


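\[ P_u = \frac{a_5 - a_3}{a_4 - a_3} \tag{145} \]

\[ P_l = \frac{a_3 - a_1}{a_3 - a_2} \tag{146} \]

where

\[ a_1 = .19\,X_{(1)} + .81\,X_{(2)} \tag{147} \]

\[ a_2 = .6\,X_{(15)} + .4\,X_{(16)} \tag{148} \]

\[ a_3 = .5\,X_{(30)} + .5\,X_{(31)} \tag{149} \]

\[ a_4 = .399\,X_{(45)} + .601\,X_{(46)} \tag{150} \]

\[ a_5 = .81\,X_{(59)} + .19\,X_{(60)} \tag{151} \]

where $X_{(i)}$ is the i-th order statistic.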
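
The interpolation weights above can be reproduced numerically. The sketch below is an illustration only: it assumes the median rank of the i-th order statistic is approximated by (i - 0.3)/(n + 0.4), a common approximation that is not stated in the text, and the function names are hypothetical. Under that assumption the weights for n = 60 come out as .19/.81, .6/.4, .5/.5, .4/.6 and .81/.19, which agrees with equations (147) to (151) up to rounding.

```python
import math
import random


def median_rank_quantile(sorted_x, p):
    """Interpolate the p-th sample percentile between adjacent order statistics.

    Assumption (not from the text): the median rank of X_(i) is taken as
    (i - 0.3) / (n + 0.4), so the fractional index is p * (n + 0.4) + 0.3.
    """
    n = len(sorted_x)
    pos = p * (n + 0.4) + 0.3            # 1-based fractional index
    pos = min(max(pos, 1.0), float(n))   # stay inside the sample
    lo = int(math.floor(pos))
    hi = min(lo + 1, n)
    w = pos - lo                         # weight on the upper order statistic
    return (1.0 - w) * sorted_x[lo - 1] + w * sorted_x[hi - 1]


def sample_percentile_ratios(x):
    """Return the sample (P_l, P_u) of equations (145)-(146)."""
    xs = sorted(x)
    a1, a2, a3, a4, a5 = (median_rank_quantile(xs, p)
                          for p in (0.025, 0.25, 0.5, 0.75, 0.975))
    p_u = (a5 - a3) / (a4 - a3)
    p_l = (a3 - a1) / (a3 - a2)
    return p_l, p_u


if __name__ == "__main__":
    random.seed(0)
    sample = [random.random() for _ in range(60)]   # uniform sample, n = 60
    print(sample_percentile_ratios(sample))
```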

To fit a nonparametric distribution to the given data using kernel estimation,
the value of the window width h must be found. A numerically
optimal value for various distributions for a sample size of 20 is found in Chapter IV.
The form used for the h value is:

\[ h_{opt} = k\,c\,s\,n^{-1/5} \tag{152} \]

where k is a constant that varies from one distribution to another,
c is an adjusting factor for the unbiasedness of s,
s is the sample standard deviation,
n is the sample size.

For this application a sample size n = 60 is used for the adaptive method. The
form of the optimal h is the same as in Chapter IV. The adjusting factor c is 1.0133.
The following table gives the values of k for different distributions.

Table 48. Suggested k for the h value

Distribution          k
Uniform               1.0589
Logistic              1.4821
Exponential           .5334
Double Exponential    .8376
Cauchy                .9657
Normal                1.1789
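
As an illustration of how equation (152) and the constants in the table might be used, the sketch below computes the window width for a sample and evaluates a kernel estimate at a point. The Gaussian kernel is an assumption made only for this example (the kernel is not specified in this chapter), and `window_width` and `kernel_density` are hypothetical helper names.

```python
import math
import statistics

# k values from Table 48 and the adjusting factor c quoted in the text.
K_TABLE = {
    "Uniform": 1.0589,
    "Logistic": 1.4821,
    "Exponential": 0.5334,
    "Double Exponential": 0.8376,
    "Cauchy": 0.9657,
    "Normal": 1.1789,
}
C_ADJUST = 1.0133


def window_width(sample, k, c=C_ADJUST):
    """Window width h = k * c * s * n^(-1/5), as in equation (152)."""
    n = len(sample)
    s = statistics.stdev(sample)          # sample standard deviation
    return k * c * s * n ** (-1.0 / 5.0)


def kernel_density(x, sample, h):
    """Kernel estimate at x; the Gaussian kernel is assumed for illustration."""
    n = len(sample)
    total = sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)
    return total / (n * h * math.sqrt(2.0 * math.pi))


if __name__ == "__main__":
    data = [0.1 * i for i in range(60)]
    h = window_width(data, K_TABLE["Uniform"])
    print(h, kernel_density(3.0, data, h))
```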

An  Adaptive  Methodology 

A Monte Carlo experiment with 1000 replications was performed at sample size 60 to
find the average and the standard deviation of the sample percentile ratios for some
distributions. The next table shows the resulting average lower and upper sample
percentile ratios, with the standard deviation given in brackets.




Table 49. Average sample percentile ratios

Distribution          P_l           P_u
Uniform               1.9750        1.9513
                      (.3937)       (.4042)
Exponential           1.7167        4.2838
                      (.3172)       (1.5324)
Cauchy                83.9780       13.8451
                      (634.0062)    (16.5956)
Double Exponential    5.2356        4.1618
                      (2.0799)      (1.4824)
Logistic              3.9424        3.2630
                      (1.3424)      (1.0109)
Normal                3.2831        2.8688
                      (.9372)       (.7899)

The adaptive nonparametric density estimation procedure started
by generating 1000 samples, each of size 60, from the above distributions. The sample
percentile ratios for each sample are then computed. A piecewise linear relation is
then used, based on the three two-tuples (p, k) from the uniform, normal, and logistic
distributions, where p and k represent the percentile ratio and the constant defined
earlier for these distributions, respectively.
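
A minimal sketch of such a piecewise linear relation is given below, assuming the anchor points are the population percentile ratios of Table 47 paired with the k values of Table 48 for the uniform, normal and logistic distributions. How ratios outside the anchor range are handled is not stated in the text; holding k constant beyond the end points is an assumption of this example, and `interpolate_k` is a hypothetical name.

```python
# (p, k) anchor points: uniform, normal, logistic (Tables 47 and 48).
ANCHORS = [(1.900, 1.0589), (2.904, 1.1789), (3.343, 1.4821)]


def interpolate_k(p, anchors=ANCHORS):
    """Piecewise linear interpolation of k for a measured percentile ratio p.

    Assumption: outside the anchor range, k is held at the nearest end value.
    """
    pts = sorted(anchors)
    if p <= pts[0][0]:
        return pts[0][1]
    if p >= pts[-1][0]:
        return pts[-1][1]
    for (p0, k0), (p1, k1) in zip(pts, pts[1:]):
        if p0 <= p <= p1:
            return k0 + (k1 - k0) * (p - p0) / (p1 - p0)
    raise ValueError("p not covered by anchors")  # unreachable for sorted anchors


if __name__ == "__main__":
    for p in (1.5, 1.9, 2.4, 2.904, 3.1, 5.0):
        print(p, round(interpolate_k(p), 4))
```

The same function can be applied once to the measured lower ratio and once to the measured upper ratio, giving one k for each tail.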

The support is subdivided into three subsets $S_1$, $S_2$ and $S_3$ such that
$\bigcup_{j=1}^{3} S_j = \mathbb{R}$ and such that:

\[ S_1 = \{x \mid x < F^{-1}(.25)\} \tag{153} \]

\[ S_2 = \{x \mid F^{-1}(.25) \le x \le F^{-1}(.75)\} \tag{154} \]

\[ S_3 = \{x \mid x > F^{-1}(.75)\} \tag{155} \]

The h value is chosen to vary with each subset of the support. The h is
empirically chosen to be a function of the distribution tail length, in the sense of
choosing different values of h for each of the three subsets of the support. This is
done by interpolating the piecewise relation for the measured P_l and P_u and finding
the corresponding k.

Table 50. MISE for the adaptive technique (with standard deviation in brackets)

Distribution    MISE_adapt    MISE_III
Uniform         .08237        .08267
                (.01891)      (.02017)
Logistic        .03244        .03904
                (.01552)      (.01763)
Normal          .00870        .00883
                (.00621)      (.00625)

The results from this chapter are shown in Table 50. The table gives the MISE for
this adaptive approach, denoted MISE_adapt, and the MISE from Chapter III, where
the estimator for the window width was $s\,n^{-1/5}$. The table shows that the adaptive
method does slightly better for the uniform and normal distributions, while
for the logistic distribution the method gives a 20% improvement in the MISE over
that of Chapter III. This result indicates that the adaptive technique, applied here
at a different sample size (60), works with the values of the constant obtained
from the Monte Carlo experiment in Chapter IV, and hence could be used in
applications that require no assumption about the form of the distribution. This
chapter therefore gives another tool for applications besides the ones discussed in
the earlier chapters.




Appendix  A.  Generation  of  random  deviates 


1.  Cauchy  Distribution 

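The probability density function is given by:

\[ f(x) = \frac{b}{\pi\left[(x-a)^2 + b^2\right]}, \qquad a, x \in \mathbb{R},\ 0 < b < \infty \]

with

mode(x) = a , median(x) = a

and with C.D.F

\[ F(x) = \int_{-\infty}^{x} \frac{b}{\pi\left[(t-a)^2 + b^2\right]}\,dt
        = \frac{1}{\pi}\tan^{-1}\left(\frac{x-a}{b}\right) + .5 \]

and the generated deviate will be given by:

\[ x = b\tan\left[\pi(u - .5)\right] + a \]

where u is a uniform (0, 1) deviate.

Also, it could be generated using the fact that if $(x_1, x_2)$ are uniformly distributed
in a circle centered at the origin then $x_1/x_2$ will be Cauchy distributed.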
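
Both generation methods can be sketched in a few lines. This is an illustrative example only; the function names are hypothetical, and the rejection step used to obtain a point uniformly distributed in the unit circle is an implementation choice not described in the text.

```python
import math
import random


def cauchy_from_uniform(u, a=0.0, b=1.0):
    """Inverse C.D.F. method: x = b * tan(pi * (u - .5)) + a."""
    return b * math.tan(math.pi * (u - 0.5)) + a


def cauchy_by_ratio(a=0.0, b=1.0, rng=random):
    """Ratio method: x1/x2 for (x1, x2) uniform in the unit circle.

    The point is drawn by rejection from the enclosing square
    (an implementation choice made for this sketch).
    """
    while True:
        x1 = rng.uniform(-1.0, 1.0)
        x2 = rng.uniform(-1.0, 1.0)
        if x1 * x1 + x2 * x2 <= 1.0 and x2 != 0.0:
            return a + b * x1 / x2


if __name__ == "__main__":
    random.seed(1)
    print([round(cauchy_from_uniform(random.random()), 3) for _ in range(5)])
    print([round(cauchy_by_ratio(), 3) for _ in range(5)])
```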




2.  Logistic  Distribution 


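\[ f(x) = \frac{\exp\left[-(x-a)/b\right]}{b\left(1 + \exp\left[-(x-a)/b\right]\right)^2} \]

with a C.D.F

\[ F(x) = \frac{1}{1 + \exp\left[-(x-a)/b\right]} \]

with

\[ E(x) = a , \quad V(x) = \frac{\pi^2 b^2}{3} , \quad mode(x) = a \]

with variates generated by:

\[ x = a + b \ln\left(\frac{u}{1-u}\right) \]

where u is a uniform (0, 1) deviate.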
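
The logistic variate follows from inverting the C.D.F. The sketch below writes the inverse as x = a + b ln(u/(1 - u)), which is one algebraic form of that inverse; the function names are hypothetical.

```python
import math
import random


def logistic_cdf(x, a=0.0, b=1.0):
    """Logistic C.D.F. F(x) = 1 / (1 + exp(-(x - a) / b))."""
    return 1.0 / (1.0 + math.exp(-(x - a) / b))


def logistic_from_uniform(u, a=0.0, b=1.0):
    """Inverse C.D.F. method: solve F(x) = u for x."""
    return a + b * math.log(u / (1.0 - u))


if __name__ == "__main__":
    random.seed(2)
    draws = [logistic_from_uniform(random.random(), a=1.0, b=2.0)
             for _ in range(5)]
    print([round(d, 3) for d in draws])
```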


3.  Weibull  Distribution 

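The 3-parameter Weibull density function is given by:

\[ f(x) = \frac{\beta}{\theta}\left(\frac{x-\delta}{\theta}\right)^{\beta-1}
          \exp\left[-\left(\frac{x-\delta}{\theta}\right)^{\beta}\right],
   \qquad \delta < x;\ \theta, \beta > 0 \]

with expected value

\[ E(x) = \delta + \theta\,\Gamma\left(\frac{\beta+1}{\beta}\right) \]

and with variance

\[ V(x) = \theta^2\left[\Gamma\left(\frac{\beta+2}{\beta}\right)
          - \Gamma^2\left(\frac{\beta+1}{\beta}\right)\right] \]

where $\Gamma$ denotes the gamma function,
and C.D.F

\[ F(x) = 1 - \exp\left[-\left(\frac{x-\delta}{\theta}\right)^{\beta}\right] \]

and the variates generated by:

\[ x = \delta + \theta\exp\left[\frac{\ln\left(-\ln(1-R)\right)}{\beta}\right] \]

where R is a uniform (0, 1) deviate.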
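
A short sketch of the Weibull generator follows, assuming delta, theta and beta denote the location, scale and shape parameters as in the formulas above. The function names are hypothetical; `math.gamma` is used only to evaluate the expected value for comparison.

```python
import math
import random


def weibull_cdf(x, delta, theta, beta):
    """3-parameter Weibull C.D.F. for x > delta."""
    return 1.0 - math.exp(-(((x - delta) / theta) ** beta))


def weibull_from_uniform(r, delta, theta, beta):
    """Inverse C.D.F. method: x = delta + theta * exp(ln(-ln(1 - r)) / beta)."""
    return delta + theta * math.exp(math.log(-math.log(1.0 - r)) / beta)


def weibull_mean(delta, theta, beta):
    """Expected value delta + theta * Gamma((beta + 1) / beta)."""
    return delta + theta * math.gamma((beta + 1.0) / beta)


if __name__ == "__main__":
    random.seed(4)
    n = 20000
    draws = []
    while len(draws) < n:
        r = random.random()
        if r > 0.0:                      # avoid log(0) at r == 0
            draws.append(weibull_from_uniform(r, 1.0, 2.0, 1.5))
    print(sum(draws) / n, weibull_mean(1.0, 2.0, 1.5))
```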




BIBLIOGRAPHY 


(1)  Anderson.  G.  D.  A  comparison  method  for  estimating  a  probability  density. 
Pli.D.  dissertation,  University  of  Washington  1909 

(2)  Bain,  L.  J.  Statistical  Analysis  of  reliability  and  life  testing  models  (Theory 
and  methods).  Marcel  Dckker,  New  York. 

(3)  Bean.  S.  J.  and  Tsokos,  C.  P.  Developments  in  nonparametric  density 
estimation.  Intern.  Statist,  Rev.  48:  215-235  (1980) 

(4)  Bickel.  P.  .J.:  Doksum,  K.  A.  Mathematical  Statistics  Molderi-Day,  Inc. 
Oakland.  California.  (1977) 

(5)  Birge.  Lucien  On  estimating  a  density  using  H  el  linger  distance  and  some 
other  strange  facts.  Prob.  Theory  and  Related  Fields  71:271-291  (1986) 

(6)  Bosq,  D.  Contribution  a  le  theorie  de  Pestimation  fonctionelle.  Pnbl.  Instit. 
Stat.  Univ.  Paris  19:1-177  (1970) 

(7)  Boswell.  Stephen  Nonparametric  estimation  of  the  modes  cl  high-dimensional 
densities,  computer  Sc.  and  Stat.  tProceedings  of  the  Sixteenth  Syinposium:21 7-225 
(1980) 

(8)  Bowman.  A.  W.  A  comparative  study  of  some  kernel  based  nonparametric 
density  estimators  .  dournal  of  stat ,  simul.  21  :3 1 3-327  (1985) 

(9)  Burke,  M.:  Horvath,  L.  Density  and  failure  rate  estimation  in  a  lornpeting 
risks  model  preprmt  Dept,  of  Math,  and  Statist..  University  of  Calgarv.  Canada. 


(1982) 


(10)  Bush,  ,1.  G;  Woodruff,  B.  W.;  A.  II.  Moore  et  al  Modified  CvM  and  AD  test 
for  Weibul!  distribution  with  unknown  location  and  scale  parameters.  Comm  in  St  at . 
A  12:2463-2476  (1983) 

(11)  Calitz,  Frid  An  alternative  to  the  Kolmogorov-Smirnov  test  for  goodness 
of  fit.  Comm,  in  Stat.  A.  16:35 19-  3534  (1987) 

(12)  Cencov,  N.  N.  Evaluation  of  an  unknown  distribution  density  from  obser¬ 
vations.  Soviet.  Math.  3:1559-1502  (1962) 

(13)  Cheng,  P.  E.  A  nearest  neighbor  hazard  rate  estimator  for  randomly  cen¬ 
sored  data.  Comm,  in  Stat.  B  16:613-  625  (1987) 

(14)  Cheng,  P.  E.  Hazard  rate  estimation  under  the  simple  proportional  hazards 
model  Bulletin  of  the  Institute  of  Math..  Academia  Sincia  15:153-162  (1987) 

(15)  -  ;  Lin,  G.  W.  Maximum  likelihood  estimation  of  a  survival  function 

under  the  Ivozoil-Green  proportional  hazard  model.  Stat.  and  Prob.  Letters  5:75-80 
(1987) 

(16)  D'Agostino,  R.  B.  ;  Stephens,  M.  (Ed);  King,  Terry  (Rev)  Review  of 
goodness  of  fit  techniques.  Technomet.ric.s  29:493-493  (1987) 

(17)  D'Agostino,  R.  B.  ;  Stephens,  M.  (Ed)Goodness  of  fit  techniques.  Marcel 
Dekker.  Inc.  New  York  and  Basel  (1986) 

(18)  De  Montricher,  G.  F.;  Tapia,  R.  A.  and  Thompson,  J.  R.  Nonparametric 

1 65 


maximum  likelihood  estimation  of  probability  densities  by  penalty  function  methods. 
Ann.  Statist.  3:1329-1348  (1975) 

(19)  Deli  nee,  J.  Robust  density  estimation  through  distance  measurements. 
Ecology.  Duke  Univ  Press,  College  Station.  Pox  6697,  Durham,  NC  27708  67:1576-1581 
(1986) 

(20) Dennis  J.  E.  JR.  ;Sclmabel,  R.  R. Numerical  Methods  for  1  Tnconst rained 
optimization  and  Nonlinear  equations.  Prentice  Hall,  Inc.,  Englewood  Cliff,  N.J. 
07632(1982) 

(21)  Devrove,  Luc  A  course  in  density  estimation.  Birkhauser  Boston,  Inc., 
Boston,  Mass  (1987) 

(22)  - ;  Penard,  C.  S.  The  strong  uniform  convergence  of  multivariate  kernel 

estimates,  the  Canadian  J.  of  Statistics  14:211-219  1986 

(23)  - - Machell,  Fred  Data  structures  in  kernel  density  estimation.  IEEE 

Transactions  on  Pattern  Analysis  and  Machines  Intelligence  7:360-366  (1985) 

(24)  Diggle,  Peter;  Hall,  Peter  The  selection  of  terms  in  an  orthogonal  series 
density  estimator.  JASA  81:230-  233  (1986) 

(25)  Dodge,  Yadolah  Some  difficulties  involving  nonparametric  estimation  of  a 
density  function.  Journal  of  Official  Stat,  2:193-202  (1986) 

(26)  Edgeman,  R.  L.;  Scott,  R.  C.  Critical  value  approximation  for  an  E.D.F- 
based  test  of  the  inverse  Gaussian  density.  Comp.  Sc.  and  Stat:proc.  of  the  19f/l 


166 


Symp.  on  the  Interface:540-542  (1987) 


(27)  Eeden,  Constance  van  Mean  integrated  square  error  of  kernel  estimators 
when  the  density  and  its  derivative  are  not  necessarily  continuous.  Annals  of  { he 
Institute  of  Statistical  Mathematics  37:461-472  (1985) 

(28)  Efroimovich,  S.  Yu  Nonparametric  estimation  of  a  density  of  unknown 
smoothness.  Theory  of  Prob.  and  its  Applications  30:557-568  (198G) 

(29)  Evans,  J.  W.;  Johnson,  R.  A.;  Green,  D.  W.  Two  and  three  parame¬ 
ter  Weibull  goodness-of-fit  tests.  Joint  Stat.  Meeting,  ASA  14S(/‘  Annual  meeting, 
Diometric  Society  Eastern  and  Western  North  Amer.  Region  (1988) 

(30)  Fletcher,  R.  Practical  Methods  of  Optimization  second  edition  John  Wi¬ 
ley  and  Sons.  (1987) 

(31)  Fix,  E  and  Hodges,  J.L.  Discriminatory  analysis,  nonparametric  estima¬ 
tion:  consistency  properties.  Report  No.  4,  Project  no.  21-49-004,  USAF  School  of 
Aviation  Medicine,  Randolph  Field,  Texas,  1951 

(32)  Fryer,  M.  J.  A  review  of  some  non-parametric  methods  of  density  estima¬ 
tion.  J.Inst  math.  Appl.20:335-354  (1977) 

(33)  Fuchs,  R.  P.  A  non-parametric  probability  density  estimator  and  some 
applications.  Ph.D.  dissertation,  WPAF  Base,  Ohio  AFIT  1984 

(34)  Gajek,  Lerlaw  On  improving  density  estimators  which  are  not  bona  fide 
functions.  The  Annals  of  Statistics  14:1612-1018  (1986) 


1G7 


(35)  Gallagher,  M.  A.  and  Moore,  A.  H.  Robust  Minimum  distance  estimation 
using  the  three  patameter  Weibull.  (submitted  for  publication  in  IEEE) 

(36)  Gill,  Phillip  E.;  Murray,  W;  Wright,  M.  1 1  .Practical  Optimization  Aca¬ 
demic  Press.  (1981) 

(37)  Ghorai,  J.  K.  Nonparametric  estimation  of  probability  density  function. 
Pli.D.  Dissertation  Purdue  Univ.  1977 

(38)  Good,  U.  J.;  Gaskins,  R.A.  Nonparametric  roughness  penalties  for  prob¬ 
ability  densities.  Biometrika  58:255-277  (1971) 

(39)  - ; - —  A  nonparametric  estimation  of  probability  densities.  Vir¬ 

ginia  J.  of  Sci  23:171-193  (1971) 

(40)  Graybill,  F.  A.  Introduction  to  Matrices  with  Applications  (1969)  in  Statistics. 
Wadsworth  Publishing  Company,  Inc.  Belmont,  California 

(41)  Greblicki,  Woldzimierz;  Pawlak  Miroslaw  Pointwise  consistency  of  the 
Hermite  series  density  estimate.  Statistics  and  Prob.  Letters  3:65-69  (1985) 

(42)  Groeneboom,  P  Estimating  a  monotone  density.  Proceedings  of  the  Berke¬ 
ley  Conference  in  Honor  of  Jerz  Nevman  and  Jack  Kiefer  vol  2:539-555  (1985) 

(43)  Hall,  P.  Cross-validation  and  the  smoothing  of  orthogonal  series  density 
estimators.  J.  of  Multi.  Analysis  21:189-206  (1987) 

(44)  -  On  Kullback-Leibler  loss  and  density  estimation.  Annals  of  Stat, 

15:1491-1519  (1987) 


168 


(45) 


On  the  use  of  the  compactly  supported  density  estimates  in  prob¬ 


lems  of  discrimination.  J.  of  Multi.  Analysis  23:131-158  (1987) 

(46)  - -  On  the  rate  of  convergence  of  orthogonal  series  density  estimators. 

Journal  of  the  Royal  Statistical  Society  Series  D  48:115-122  (1986) 

(47)  Hall,  P.;  Marion,  J.  S.  On  the  amount  of  noise  inherent  in  bandwidth 
selection  for  a  kernel  density  estimator.  Annals  of  Stat,.  15:163-181  (1987) 

(48)  - Extent  to  which  least-squares  cross-validation  minimizes  integrated 

square  error  in  nonparametric  density  estimation.  Prob.  Theory  74:567-581  (1987) 

(49)  Hall,  Peter;  Watson,  G.  S.;  Cabrera,  J.  Kernel  density  estimation  with 
spherical  data.  Biometrika  74:751-762  (1987) 

(50) Harter,  H.  L.;  Khamis,  H.  T.  and  Lamb,R.E.  Modified  K-S  tests  of  good¬ 
ness  of  fit.  Comm  Stat.  Simula  Comput  13:293-323  (1984) 

(51) Harter,  H.  L.;  Moore,  A.  H.  Maximum  likelihood  estimation  of  the  param¬ 
eters  of  gamma  and  Weibull  populations  from  complete  and  from  censored  samples. 
Technometrics  7  No.  4:639-643  (1965) 

(52) Hobbs,  J.  R.;Moore,  A.  H.  and  James,  W.  Minimum  distance  estima¬ 
tion  of  the  three  parameters  gamma  distribution.  IEEE  Transactions  on  Reliability 
33  No. 3:237-  240  (1984) 

(53) Hobbs,  J.  R.;Moore,  A.  H.  and  Miller,  R.  M.  Minimum  distance  estimation 
of  the  parameters  of  the  three  parameter  Weibull  distribution.  IEEE  Transactions 


169 


on  Reliability  34  No.  5:495-490  (1985) 


(54)  Hosmane,  Balakrishna  Improved  likelihood  ratio  test  for  multinomial 
goodness  of  fit.  Comm,  in  St  at.  A  10:  3185-3198  (1987) 

(55)  Jinadasa,  K.  G.  Maximum  likelihood  estimation  with  additional  data. 
Joint  Stat.  Meeting,  ASA  148(/l  Annual  meeting.  Biometric  Society  Eastern  and 
Western  North  Amer.  Region  (1988) 

(50)  Johnson  E.  G.;  Routledge  The  line  transect  method:  A  nonparametric 
est  imator  based  on  shape  restrictions.  Biometrics  41:009-079  (1985) 

(57)  Kappenman,  R.  S.  A  nonparametric  data  based  univariate  density  function 
estimate.  Comp.  Stat.  and  Data  Analysis  5:1-7  (1987) 

(58)  Kraft,  C.  H.;  Lepage, Y.;  Eeden  van,  C.  Estimation  of  a  symmetric  density 
function.  Communications  in  Statistics  A  14:273-288  (1985) 

(59)  Kullback,  Sulomon  The  Kullback-Leibler  distance.  Amer.  Statistician 
44:340-341  (1987) 

(00)  Kumar,  T.  K.;  Markmann,  J.  M.  Estimation  of  probability  density  func¬ 
tion  :  A  Monte  Carlo  comparison  of  parametric  and  nonparametric  methods. 

(01)  Littelle,  R.  D.  et  al  Goodness  of  fit  tests  for  the  two  parameter  distribution. 
Comm. Stat.  Simul  Comp  B  8:257-269 

(62)  Lilliefors,H ,W  On  the  Kolmogorov  test  for  the  exponential  distribution  with 
mean  unknown  JASA  64:387-389  (1969) 


170 


(63) 


On  the  Kolmogorov  test  for  normality  with  mean  and  variance 


unknown  JASA  62:143-147  (1967) 

(64)  - The  chi  square  goodness-of-fit  test  revisited.  Joint  Stat.  Meeting 

■  ASA  \A8th  Annual  meeting,  Biometric  Society  Eastern  and  Western  North  Amen  Region 
(1988) 

(65)  Luong,  A.;  Thompson,  M.  E.  Minimum-distance  methods  based  on  quadratic 
distances  for  transforms.  Canad.  .1.  of  Stat.  15:239-251  (1987) 

(66)  Man,  N.  R.  Point  and  interval  estimation  procedures  for  the  two  parameter 
VVeibull  and  extreme  value  distributions.  Technometrics  10:231-256  (1968) 

(67)  Marshall,  A.  W.;  Proschan,  F.  Maximum  likelihood  estimation  for  distri¬ 
butions  with  monotone  failure  rate.  Annals  of  Math.  Stat.  36:69-77  (1965) 

(68)  McCune,  E.  D.;  McCune,  S.  K.  Modified  nonnegative  kernel  estimator  of 
the  failure  rate  function.  Amer.  Stat,  Assoc.,  Proc.  of  Stat,  Computing  Section:353- 
354  (1987) 

(69)  - On  improving  convergence  rates  for  nonnegative  kernel  failure-rate 

function  estimators.  Stat.  and  Prob.  Letters  6:71-76  (1987) 

(70)  Moore,  A.  H.  A  modified  Kolmogorov-Smirnov  test  for  Weibull  distribu¬ 
tions  with  unknown  location  and  scale  parameters.  IEEE  Transactions  on  Reliability 
23:109-  213  (1983) 

(71)  - Modified  Cramer- von  Mises  and  Anderson-  Darling  test  for  Weibull 


171 


distribution  with  unknown  location  and  scale  parameters.  Communications  in  Statis¬ 
tics  -  Theoretical  Methods  12(21  ):2465- 2476  (1984) 

(72)  - Modified  goodness-of-fit  tests  for  gamma  distributions  with  un¬ 

known  location  and  scale  parameters.  IEEE  Transactions  on  Reliability  33:241-245 
(1984) 

(73)  - — —  A  Monte  Carlo  technique  for  estimating  lower  confidence  limits  on 
system  reliability  using  pass-fail  data.  IEEE  Transactions  on  Reliability  32:306-369 
(Oct. 1983) 

(74)  — — —  Minimum-distance  estimation  of  the  three  parameters  of  the  gamma 
distribution.  IEEE  Transactions  on  Reliability  32:237-240  (Aug. 1984) 

(75)  - Minimum-distance  estimation  of  the  parameters  of  the  3-parameter 

Wei  bull  distribution.  IEEE  Transactions  on  Reliability  34:495-496  (Dec. 1985) 

(76)  -  A  Monte  Carlo  method  for  determining  confidence  bounds  on  re¬ 

liability  and  availability  of  maintained  systems.  IEEE  Transactions  on  Reliability 
34:497-498  (Dec.  1985) 

(77)  - Modified  goodness-of-fit  tests  for  logistic  distribution  with  unknown 

location  and  scale  parameters.  Comm,  in  Stat.  -  Simula.  Computa.  15(l):77-83  (1986) 

(78)  -  Modified  goodness-of-fit  test  for  the  Laplace  distribution.  Comm. 

in  Stat.  -  Simulation  17(1):275-  281  (1988) 

(79)  - —  Extension  of  Monte  Carlo  techniques  for  obtaining  system  relia- 


172 


bilty  confidence  limits  from  component  test  data.  Proceedings  of  national  Aerospace 


electronic  conference:459-463  (19C5) 

(80) - Robust  statistical  inference  Notes  from  a  short  course  presented  at 

the  Air  Force  Institute  of  Technology,  Wright  Patterson  Air  Force  Base,  Ohio  (1081) 

(Sl)Moore,  A.  II.;  Ream,  .1.  J.  and  Woodruff,  B.  W.  A  new  goodness  of  fit 
tests  for  normality  with  mean  and  variance  unknown,  (submitted  for  publication) 

(82)  More  ,  J.  J.;  et  al.  User  Guide  for  Minipack- 1  Argonne  National  Labora¬ 
tory. 

(83)  Nadaraja,  E.  On  nor  parametric  estimation  of  density  function  and  regres¬ 
sion.  Theory  Prob.  Appl.  10:186-190  (1965) 

(84)  Ortega,  J.  M.;  Rheinboldt,  W.  C.  Iterative  Solution  of  Nonlinear  equations 
in  Several  Variables.  Academic  Press.  New  York  and  London.  (1970) 

(85)  Paraska  Rao,  B.  L.  S.  — underlineNonparametric  functional  estimation. 
Academic  Press.  (1983) 

(86)  Parr,  W.  C.  and  Schucany,  W.  R.  Minimum  distance  and  robust  estima¬ 
tion.  JASA  75  No  3:616-624  (1980) 

(87)  Parzen,  E.  On  estimation  of  a  probability  function  and  mode.  Annals  of 
Math.  Stat.  33:1065-1076  (1962) 

(88)  Powell,  M.  J.  D.  ”  A  hybrid  method  for  nonlinear  equations,”  Numerical  methods 
for  Nonlinear  Algebraic  equations  ,  P.  Rabinwitz  editor  (1970) 


173 


(89)  Powell,  M.  J.  D.  Approximation  theory  and  methods  Cambridge  Univer¬ 


sity  Press,  Cambridge  (1981) 

(90)  Porter,  J.  E.;  A.  II.  Moore  et  al  Modified  Kolmogorov,  AD  and  CvM  tests 
for  Pareto  distribution  with  unknown  location  and  scale  parameters. (submitted  for 
publication) 

(91)  Ramberg,  J.  S.,  et.  al.  A  probability  distribution  and  its  use  in  fitting 
data.  Technometrics  21:201-214  (1979) 

(92)  Reklaitis,  G.  V.  Engineering  optimization.  Methods  and  applications  John 
Wiley  Sons  (19S3) 

(93)  Revesz,  P.  Density  estimation.  Handbook  of  Statistics  vol  4:531-549  (1984) 

(94)  Rock,  N.  M.  S.  NPSTAT:  A  Fortran-77  program  to  perform  nonparametric 
variable-by-variable  comparisons  on  two  or  more  independent  groups  of  data.  Comp. 
and  Geosc.  12:757-777  (1986) 

(95)  - ROBUST:  An  interactive  Fortarn-77  package  for  exploratory  data 

analysis  using  parametric,  robust  and  nonparametric  location  and  scale  estimates, 
data  transformations,  normality  tests  and  outlier  assessment.  Comp,  and  Geosc. 
13:463-494  (1987) 

(96)  Schuster,  E.  F.  Note  on  uniform  convergence  of  density  estimates.  Annals 
of  Math.  St.at.  41:1347-1348  (1970) 

(97)  Scoff,  David  W.  Choosing  smoothing  parameters  for  density  estimators. 


174 


computer  Sc.  and  Stat.  .Proceedings  of  the  Seventeenth  Symposium:225-229  (198G) 


(98)  Sheather,  Simon  J.  An  improved  criteria  for  choosing  the  window  width 
when  estimating  the  density  at  a  point.  Comp.  Stat.  and  Data  Analysis  -1:61 -G5 
(1986) 

(99)  Silverman,  D.  W.  Density  estimation  for  statistics  and  data  analysis.  Chap¬ 
man  and  Hall  Ltd  (198G) 

(100)  Silverman,  B.  W.;  Young,  G.  A.  The  bootstrap:  To  smooth  or  not  to 
smooth?  Biomet rika  74:469-479  (1987) 

(101)  Stephens,  M.  E.D.F.  statistics  for  goodness  of  fit.  .1.  Amer.  Statist.  69:730- 
737  (1974) 

(102)  Sweeder,  J.  Nonparametric  estimation  of  distribution  and  density  fnne- 
tions  with  applications.  Ph.D.  Dissertation  WPAF  Base,  Ohio  AFIT  1982 

(103)  Tarter,  Michael  E.;  Freeman,  William;  Hopkins,  Alan  A  Fortran  imple¬ 
mentation  of  univariate  Fourier  series  density  estimation.  Comm,  in  Stat.  B  15:855- 
870  (1986) 

(104)  Taylor,  Malcolm  S  ;Thompson,  .lames.  R  A  data  based  algorithm  for  the 
generation  of  random  vectors.  Comp.  Stat. and  Data  Analysis  A  4:93-101  (1986) 

(105)  Wahba,  Grace  Optimal  smoothing  of  density  estimates.  Annals  of  Math. 
Stat ,:423-452  (1983) 


(106)  Wakimoto,  Kazumasa  et  al.  Testing  the  goodness  of  fit  of  the  multinomial 


distribution  based  on  graphical  representation.  Comp.  Stat .  and  Data  Analysis  A 
5:137-  147  (1987) 

(107)  Walter,  G.;  Blum,  J.  R.  Probability  density  estimation  using  delta  se¬ 
quences.  Annals,  of  Stat.  7:328-340  (1979) 

(108)  Wasan,  M.  T.  Parametric  Estimation  McGraw  Hill  Book  Company. (1970) 

(109)  Watson,  G.  S.  and  Leadbetter,  M.  R. Hazard  analysis  I.  Biometrika  31 :175- 
184  (1964) 

(110)  Wegtnan,  E.  J.  Density  estimation.  Encyclopedia  of  Stat  Science  vol  2:309- 
315  (1982) 

(111)  Wolfowitz,  J.  The  minimum  distance  method.  Annals,  of  Mat  h,  Slat,  28:75- 
88  (1957) 

(112)  Wolfowitz,  J.  Estimation  by  the  minimum  distance  method.  Annals,  of  the  Inst 
of  Stat.  Math.  5:9-23  (1953) 

(113) Woodruff, B. W.; A. H. Moore et al. A modified K-S test for Weibull distribution
with unknown location and scale parameters. IEEE Transactions on Reliability:209-213
(1983)

(114) Woodruff, B. W.; A. H. Moore et al. Modified goodness of fit tests for
logistic distribution with unknown location and scale parameters.
Stat. Simula. Comp. 15(1):77-83 (1986)

(115) Woodruff, B. W.; A. H. Moore et al. A new goodness of fit test for the
uniform with unspecified parameters. (submitted for publication)

(116) Woodruff, B. W.; A. H. Moore et al. Modified goodness of fit tests for
gamma distributions with unknown location and scale parameters. IEEE Transactions
on Reliability 33:241-245 (1984)

(117) Yen, V. C. and Moore, A. H. Modified goodness of fit tests for the Laplace
distribution. (submitted for publication)




Vita 


Col.  Ahmed  Mohamed  M.  Sultan 

He graduated as an electrical engineer in 1975 from the Military Technical College
(MTC) of Cairo. Upon receipt of his degree he served as an electrical engineer
for aircraft electrical and special equipment and instruments. Later on, his interest
in Operations Research (O.R.) started to grow. In 1977 he joined the Institute
of Statistical Studies and Research (ISSR) for a two-year diploma in O.R. In 1979
he received his diploma degree from the ISSR; his graduation project dealt with
planning the power supply for a new underdeveloped city in Egypt. In 1979, upon
receipt of his diploma degree, he joined the M.S. program in O.R. for one year in
the same school (ISSR). The M.S. program at ISSR consists of one year of courses and a
thesis.

On finishing his first year of courses at ISSR, in the summer of 1981, he was
selected on a competitive basis to join an M.S. program in O.R. at the Air Force
Institute of Technology (AFIT). In December of 1982 he received his M.S. in O.R.
His M.S. thesis was on Robust Multiple Linear Regression, where an extensive
Monte Carlo analysis was conducted to determine the performance of robust linear
regression techniques with and without outliers.

Upon  the  receipt  of  his  M.S  he  worked  for  the  Operations  Department  of  the 
Egyptian  Air  Force  as  an  analyst.  In  addition  he  worked  part  time  for  the  M.T.C 
teaching  O.R.  and  Probability  and  Statistics. 




Later he was chosen to work as a mathematics instructor in the Egyptian Air
Academy and was sent for a Ph.D. in statistics at AFIT.

Permanent  address:  9-4  Wassef  St.  Ein  Shams 
CAIRO  ,  EGYPT 




SECURITY CLASSIFICATION OF THIS PAGE

REPORT DOCUMENTATION PAGE (Form Approved, OMB No. 0704-0188)

1a. REPORT SECURITY CLASSIFICATION: Unclassified
1b. RESTRICTIVE MARKINGS:
2a. SECURITY CLASSIFICATION AUTHORITY:
2b. DECLASSIFICATION/DOWNGRADING SCHEDULE:
3. DISTRIBUTION/AVAILABILITY OF REPORT: Approved for public release; distribution unlimited
4. PERFORMING ORGANIZATION REPORT NUMBER(S): AFIT/DS/ENC/90-1
5. MONITORING ORGANIZATION REPORT NUMBER(S):
6a. NAME OF PERFORMING ORGANIZATION: Air Force Institute of Technology
6b. OFFICE SYMBOL (If applicable):
6c. ADDRESS (City, State, and ZIP Code): Wright-Patterson AFB OH 45433
7a. NAME OF MONITORING ORGANIZATION:
7b. ADDRESS (City, State, and ZIP Code):
8a. NAME OF FUNDING/SPONSORING ORGANIZATION:
8b. OFFICE SYMBOL (If applicable):
8c. ADDRESS (City, State, and ZIP Code):
9. PROCUREMENT INSTRUMENT IDENTIFICATION NUMBER:
10. SOURCE OF FUNDING NUMBERS: PROGRAM ELEMENT NO.; PROJECT NO.; TASK NO.; WORK UNIT ACCESSION NO.
11. TITLE (Include Security Classification): Applications of Non-parametric Density Estimation
12. PERSONAL AUTHOR(S): Ahmed M.M. Sultan, Lt Col, Egyptian Air Force
13a. TYPE OF REPORT: Ph.D. Dissertation
13b. TIME COVERED: FROM 10/5/87 TO
14. DATE OF REPORT (Year, Month, Day): February 28, 1990
15. PAGE COUNT: 179
COSATI CODES: FIELD 12; GROUP 3; SUB-GROUP
18. SUBJECT TERMS (Continue on reverse if necessary and identify by block number):
Distance Estimation, Goodness of Fit Tests, Non-Parametric Density Estimation

19. ABSTRACT (Continue on reverse if necessary and identify by block number)

The dissertation examines various methods of nonparametric density estimation, with
nonparametric kernel estimation treated in more detail. The consequences of various kernel
window widths and their effect on the mean integrated square error are examined using Monte
Carlo techniques. The mean and the variance of the nonparametric density estimator are derived
for symmetric kernels with finite mean and finite variance. The results also treat kernels with varying
window  parameters. 
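
The Monte Carlo study of window width summarized above can be illustrated with a minimal sketch. It assumes a Gaussian kernel, standard normal samples, and a Riemann-sum approximation of the integrated squared error; the function names (kernel_density, mise_monte_carlo), the grid, and the sample sizes are illustrative choices and are not taken from the dissertation.

```python
import math
import random


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_density(x, sample, h):
    """Kernel estimate f_hat(x) = (1 / (n h)) * sum_i K((x - x_i) / h)."""
    n = len(sample)
    return sum(gaussian_kernel((x - xi) / h) for xi in sample) / (n * h)


def mise_monte_carlo(h, n=50, reps=40, seed=0):
    """Average integrated squared error over `reps` N(0, 1) samples of size n."""
    rng = random.Random(seed)
    step = 0.1
    grid = [-4.0 + step * i for i in range(81)]  # integration grid on [-4, 4]
    total = 0.0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Riemann-sum approximation of the integral of (f_hat - f)^2,
        # where the true density f is the standard normal density.
        ise = sum((kernel_density(x, sample, h) - gaussian_kernel(x)) ** 2
                  for x in grid) * step
        total += ise
    return total / reps


if __name__ == "__main__":
    for h in (0.05, 0.5, 3.0):
        print(f"h = {h:4.2f}  estimated MISE = {mise_monte_carlo(h):.4f}")
```

A very small window gives a rough, high-variance estimate and a very large window oversmooths, so the estimated mean integrated square error is expected to be smallest for an intermediate window width.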

The nonparametric kernel estimate was used to obtain new estimators for the three-parameter
Weibull distribution using distance estimation and the Cramer-von Mises statistic.
Comparison with maximum likelihood estimators using a Monte Carlo sample of size 1000 and
various parameter values showed a significant improvement over the maximum likelihood
estimators in the mean integrated square error between the estimated distribution and the true
distribution. 
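
The idea of distance estimation can be sketched as follows. This is a simplified illustration, not the dissertation's procedure: it measures the Cramer-von Mises distance between the ordinary empirical distribution function (rather than the kernel estimate) and a three-parameter Weibull distribution, and it minimizes over a small candidate grid. The function names and the candidate grids are hypothetical.

```python
import math
import random


def weibull_cdf(x, loc, scale, shape):
    """Three-parameter Weibull distribution function."""
    if x <= loc:
        return 0.0
    return 1.0 - math.exp(-(((x - loc) / scale) ** shape))


def cramer_von_mises(sample, loc, scale, shape):
    """W^2 = 1/(12 n) + sum_i (F(x_(i)) - (2 i - 1) / (2 n))^2."""
    xs = sorted(sample)
    n = len(xs)
    w2 = 1.0 / (12.0 * n)
    for i, x in enumerate(xs, start=1):
        w2 += (weibull_cdf(x, loc, scale, shape) - (2 * i - 1) / (2.0 * n)) ** 2
    return w2


def fit_min_distance(sample, locs, scales, shapes):
    """Return the (loc, scale, shape) candidate with the smallest W^2."""
    best = None
    for loc in locs:
        for scale in scales:
            for shape in shapes:
                w2 = cramer_von_mises(sample, loc, scale, shape)
                if best is None or w2 < best[0]:
                    best = (w2, loc, scale, shape)
    return best[1:]


if __name__ == "__main__":
    rng = random.Random(1)
    # random.weibullvariate(alpha, beta): alpha is the scale, beta the shape.
    data = [10.0 + rng.weibullvariate(2.0, 1.5) for _ in range(100)]
    locs = [9.0 + 0.25 * k for k in range(5)]      # candidates 9.0 ... 10.0
    scales = [1.0 + 0.25 * k for k in range(9)]    # candidates 1.0 ... 3.0
    shapes = [0.5 + 0.25 * k for k in range(11)]   # candidates 0.5 ... 3.0
    print(fit_min_distance(data, locs, scales, shapes))
```

A grid search is used only to keep the sketch self-contained; a numerical optimizer would normally replace it.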


20. DISTRIBUTION/AVAILABILITY OF ABSTRACT: □ UNCLASSIFIED/UNLIMITED  □ SAME AS RPT.  □ DTIC USERS
21. ABSTRACT SECURITY CLASSIFICATION: Unclassified
22a. NAME OF RESPONSIBLE INDIVIDUAL: Dr. Albert H. Moore
22b. TELEPHONE (Include Area Code): (513) 255-3098
22c. OFFICE SYMBOL: AFIT/ENC

DD Form 1473, JUN 86. Previous editions are obsolete.

SECURITY CLASSIFICATION OF THIS PAGE: Unclassified


Several new goodness of fit tests are proposed using the nonparametric kernel
estimator and the Cramer-von Mises and Anderson-Darling statistics. Extensive
Monte Carlo experiments were performed to obtain the critical values for the tests and
to study the power of the tests against eight alternative distributions. The tests
using the Anderson-Darling statistic showed greater power against almost all alternative
distributions  studied  than  the  K.S.  test. 
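
How Monte Carlo experiments yield critical values can be sketched under simplifying assumptions. The example uses the classical Anderson-Darling statistic for a fully specified null distribution (so the transformed data are uniform under the null), not the kernel-based variants proposed in the dissertation; the function names, sample size, and number of replications are illustrative.

```python
import math
import random


def anderson_darling(u):
    """A^2 for values u_i = F0(x_i), which are uniform on (0, 1) under the null."""
    us = sorted(u)
    n = len(us)
    total = 0.0
    for i in range(1, n + 1):
        # us[i - 1] is u_(i); us[n - i] is u_(n + 1 - i) in 1-based notation.
        total += (2 * i - 1) * (math.log(us[i - 1]) + math.log(1.0 - us[n - i]))
    return -n - total / n


def open_uniform(rng):
    """Uniform draw on the open interval (0, 1), so both logarithms are finite."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u


def critical_value(n, alpha=0.05, reps=2000, seed=0):
    """Monte Carlo estimate of the upper-alpha critical value of A^2."""
    rng = random.Random(seed)
    stats = sorted(
        anderson_darling([open_uniform(rng) for _ in range(n)])
        for _ in range(reps)
    )
    index = math.ceil((1.0 - alpha) * reps) - 1  # empirical (1 - alpha) quantile
    return stats[index]


if __name__ == "__main__":
    print(critical_value(20, alpha=0.05))
```

The power against an alternative can be estimated the same way: draw samples from the alternative, transform them with the null distribution function, and count how often the statistic exceeds the critical value.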

A new nonparametric kernel estimator was introduced by varying the window width in
each tail portion of the sample. The method permitted a different window width in each tail
portion and in the center portion of the sample. The method uses the sample percentile
ratios separately as a measure of each tail length. The kernel parameter for the tail sample
values is chosen using sample percentile ratios for that tail. The nonparametric kernel
estimator  results  in  comparable  mean  integrated  errors  with  the  estimators  developed  earlier. 
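
A kernel estimate with separate window widths for the lower tail, the center, and the upper tail can be sketched as below. The rule that derives the tail widths from sample percentile ratios is not given in this abstract, so the widths and the tail fraction are passed in as hypothetical inputs; the function names are also illustrative.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def tail_widths(sample, h_left, h_center, h_right, tail_fraction=0.1):
    """Assign a window width to each point by its rank (hypothetical rule)."""
    n = len(sample)
    k = int(tail_fraction * n)  # number of points treated as each tail
    order = sorted(range(n), key=lambda i: sample[i])
    widths = [h_center] * n
    for i in order[:k]:
        widths[i] = h_left      # smallest k observations: lower tail
    for i in order[n - k:]:
        widths[i] = h_right     # largest k observations: upper tail
    return widths


def variable_kernel_density(x, sample, widths):
    """f_hat(x) = (1 / n) * sum_i (1 / h_i) * K((x - x_i) / h_i)."""
    n = len(sample)
    return sum(gaussian_kernel((x - xi) / h) / h
               for xi, h in zip(sample, widths)) / n


if __name__ == "__main__":
    data = [0.1, -2.0, 0.5, 3.5, 0.0, -0.3, 0.8, 1.0, -0.7, 0.2]
    widths = tail_widths(data, h_left=0.8, h_center=0.4, h_right=1.2)
    print(variable_kernel_density(0.0, data, widths))
```

Because each term is a properly scaled density, the estimate still integrates to one whatever widths are assigned to the individual points.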






